Post on 31-Mar-2015
“Big Data” Initiative as an IT Solution for Improved Operation and Maintenance of Wind Turbines
Zsolt János Viharos, Csaba István Sidló, András A. Benczúr, János Csempesz, Krisztián Balázs Kis, István Petrás, András Garzó
Computer and Automation Research Institute of the Hungarian Academy of Sciences (MTA SZTAKI)
Laboratory on Engineering and Management Intelligence and Informatics Laboratory
Big Data Business Intelligence Research Group
[Figure: Non-conform situation detection - estimation of the gearbox bearing temperature by a neural network model (model validity: ambient temperature between 4 and 10 °C). Axes: Time - a year; Temperatures; Model estimation error (%) [limit: +/-17%]. Series: Values_for_Model_INPUT_1, Values_for_Model_INPUT_2, Gearbox bearing temperature_MODEL_ESTIMATES, Gearbox bearing temperature_MEASURED, Ambient temperature (for model validity), Error_%.]
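The non-conform situation check named in the figure can be sketched as follows. This is a minimal illustration, not the authors' implementation: the error limit (+/-17%) and the validity range (ambient temperature between 4 and 10 °C) come from the figure, while the relative-error formula, the `Sample` type, and the function names are assumptions made for the example. It also assumes the measured temperature is non-zero.

```python
from dataclasses import dataclass
from typing import Optional

ERROR_LIMIT_PCT = 17.0             # limit shown in the figure: +/-17 %
AMBIENT_VALID_RANGE = (4.0, 10.0)  # model validity range from the figure, in °C


@dataclass
class Sample:
    measured: float   # measured gearbox bearing temperature
    estimated: float  # neural network estimate of the same temperature
    ambient: float    # ambient temperature, used only for the validity check


def error_pct(sample: Sample) -> float:
    """Relative estimation error in percent (assumed definition)."""
    return 100.0 * (sample.measured - sample.estimated) / sample.measured


def is_non_conform(sample: Sample) -> Optional[bool]:
    """Return True/False inside the validity range, None outside it."""
    low, high = AMBIENT_VALID_RANGE
    if not (low <= sample.ambient <= high):
        return None  # model is not valid here, so no decision is made
    return abs(error_pct(sample)) > ERROR_LIMIT_PCT
```

Returning `None` outside the validity range keeps "no decision" distinct from "conforming", so samples the model was never meant to cover do not raise or suppress alarms.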
PLCs
SCADA
Local computer
Central server

Parameters/TimeStamp  parameter1  parameter2  parameter3  parameter4  parameter5  parameter6  parameter7
 1. time point                98          98          34          34          29          59          57
 2. time point                99          99          34          36          29          60          57
 3. time point                97          97          34          40          29          60          57
 4. time point                98          98          34          41          29          61          57
 5. time point                86          86          34          41          29          61          57
 6. time point                98          98          34          41          29          62          57
 7. time point                98          98          35          43          29       63.75          58
 8. time point                99          99          35          43          29          66          59
 9. time point                97          97          35          44          29          66          59
10. time point                98          98          35          44          29          66          59
11. time point                98          98          35          44          29          66          59
12. time point                99          99          35          44          29          65          59
13. time point                97          97          35          44          29          65          59
14. time point                98          98          35          44          29          64          59
15. time point                86          86          35          44          30          63          59
16. time point                98          98          35          44          30          63          58
17. time point               100         100          34          43          30          63          58
18. time point               102         102          34          43          30          62          58
19. time point               104         104          34          42          30          62          58
20. time point               102         102          34          42          30          62          58
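Analysis
DW
[Figure: data flow of the two prototypes. NOSQL path: generated input data in SQL Server, Sqoop, Hive in HDFS (Hadoop), Sqoop, aggregated data in SQL Server. SQL path: generated input data in SQL Server, SQL queries, aggregated data in SQL Server.]
Big Data initiative layers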
3
[Figure: landscape of data processing tools and applications. Axes: Size (GB to PB) and Speed (Batch to Real time). Regions: Big Analytics, Fast Data.
Tools: MySQL, PostgreSQL, Hadoop, Storm, InfoBright, GreenPlum, Vertica, HBase, Netezza, MapR, VoltDB, Oracle, SQL Server, Cloudera, custom hardware, Matlab, R, SPSS, SAS, Mahout, custom software, Revolution.
Application areas: IT logs, fraud detection, wind turbine sensors, navigation and mobility, media pricing, Web content, online reputation.]
Data processing alternatives for wind farm data
4
[Figure: four data processing alternatives, each fed by wind farms. Components shown per alternative:
Present: SQL
Present with DW: SQL, DW
Big Data with ETL: SQL, DW, Big Data layer, ETL
Direct Big Data: DW, Big Data layer, SQL adapter, Streaming, Real time alarms]
Data flow of the SQL and Big Data (NOSQL) prototypes
5
Benchmarking data set
● The task is to load a “heavy” aggregate wind farm data cube:
● number of commands,
● the number and average, minimum, maximum and standard deviation of the length of all alarms, warnings and events,
● number and average length of different statuses,
● minimum, maximum, average and standard deviation of 8 selected typical, mostly relevant signals.
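The kind of aggregation listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout `(wind_farm, kind, length)`, the function name `aggregate_lengths`, the sample values, and the choice of population standard deviation are all hypothetical, and the benchmark itself ran as SQL and Hive queries rather than Python.

```python
from collections import defaultdict
from statistics import mean, pstdev


def aggregate_lengths(records):
    """Group (wind_farm, kind, length) records and summarise each group."""
    groups = defaultdict(list)
    for farm, kind, length in records:
        groups[(farm, kind)].append(length)
    return {
        key: {
            "count": len(values),
            "avg": mean(values),
            "min": min(values),
            "max": max(values),
            "std": pstdev(values),  # population std; an assumption
        }
        for key, values in groups.items()
    }


# Made-up example records: (wind farm id, event kind, length)
records = [
    (70, "alarm", 120.0),
    (70, "alarm", 60.0),
    (70, "warning", 30.0),
    (71, "alarm", 90.0),
]
cube = aggregate_lengths(records)
```

Each key of `cube` corresponds to one cell of the aggregate (wind farm by event kind); the same grouping pattern would apply to statuses and to the selected signals.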
6
Data processing times (SQL vs. Big Data)
7
[Figure: execution time (minutes, 0 to 140) against number of wind farms (0 to 180). Series: SQL Server; Hadoop and Hive (2 nodes); Hadoop and Hive (48 nodes).]
[Figure: Cumulated COMMAND numbers for five wind farms by years. Axes: Years (2007 to 2011); Wind farms (70 to 74); Cumulative COMMAND number (0 to 50,000,000).]
[Figure: Monthly average SCADA signal values at wind farm level. Horizontal axis: Months of 2008. Vertical axes: Monthly average: Nacelle temperature, Reactive network power, Wind speed; Monthly average produced energy. Series: Produced energy, Wind turbine temperature, Network power, Wind speed.]
[Figure: Monthly cumulative COMMAND number and average wind turbine temperature for a wind farm in 2008. Horizontal axis: Year: 2008. Vertical axes: Monthly average of the nacelle's temperatures; Cumulative COMMAND number. Series: Command Number, Average Wind Turbine Temperature.]
Business Intelligence example reports
8
Contact
Dr. Zsolt János VIHAROS
MTA SZTAKI
zsolt.viharos@sztaki.mta.hu
www.sztaki.hu/~viharos
http://bigdatabi.sztaki.hu/
9