Research on Advanced Training Algorithms of Neural Networks
Hao Yu, Ph.D. Defense, Aug 17th, 2011
Supervisor: Bogdan Wilamowski; Committee Members: Hulya Kirkici, Vishwani D. Agrawal, Vitaly Vodyanoy; University Reader: Weikuan Yu
Outline
• Why Neural Networks
• Network Architectures
• Training Algorithms
• How to Design Neural Networks
• Problems in Second Order Algorithms
• Proposed Second Order Computation
• Proposed Forward-Only Algorithm
• Neural Network Trainer
• Conclusion & Recent Research
What is a Neural Network?
• Classification: separate the two groups (red circles and blue stars) of twisted points [1].
What is a Neural Network?
• Interpolation: given the 25 points (red), find the values of points A and B (black)
What is a Neural Network?
• Human Solutions
• Neural Network Solutions
What is a Neural Network?
• Recognition: restore the noised digit images (left) to the original images (right)
What is a Neural Network?
• “Learn to Behave”
• Build any relationship between inputs and outputs [2]
(Diagram: learning process → "behave")
Why Neural Networks
• What makes neural networks different?
Given Patterns (5×5=25) Testing Patterns (41×41=1,681)
Different Approximators
• Test Results of Different Approximators
(Surface plots: Mamdani fuzzy, TSK fuzzy, neuro-fuzzy, SVM-RBF, SVM-polynomial; nearest, linear, spline, and cubic interpolation (MATLAB function interp2); and neural network)
Comparison
• Neural networks are potentially the best approximators

Methods of Computational Intelligence (sum of squared errors):
– Fuzzy inference system (Mamdani): 319.7334
– Fuzzy inference system (TSK): 35.1627
– Neuro-fuzzy system: 27.3356
– Support vector machine (RBF kernel): 28.9595
– Support vector machine (polynomial kernel): 176.1520
– Interpolation (nearest): 197.7494
– Interpolation (linear): 28.6683
– Interpolation (spline): 11.0874
– Interpolation (cubic): 3.2791
– Neural network (4 neurons, FCC): 2.3628
– Neural network (5 neurons, FCC): 0.4648
Outline
• Why Neural Networks
• Network Architectures
• Training Algorithms
• How to Design Neural Networks
• Problems in Second Order Algorithms
• Proposed Second Order Computation
• Proposed Forward-Only Algorithm
• Neural Network Trainer
• Conclusion & Recent Research
A Single Neuron
• Two basic computations:

net = Σ_{i=1}^{7} w_i x_i + w_0   (1)
y = f(net)   (2)

• Example activation functions: y = f(x) = gain · x (linear) and y = f(x) = tanh(gain · x), here with gain = 1
(Diagram: inputs x1 … x7 and bias +1 feed the summing node producing net; the output is y = f(net))
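The two computations above can be sketched in a few lines of Python (a toy illustration; the input values, weights, and gain are made up, not taken from the slides):

```python
import math

def neuron(x, w, gain=1.0):
    """Single neuron: net = sum(w_i * x_i) + w_0, then y = f(net)."""
    net = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))   # (1)
    return math.tanh(gain * net)                            # (2)

# seven inputs plus a bias weight w_0, as in the diagram
x = [0.5, -1.0, 0.2, 0.8, -0.3, 0.1, 0.9]
w = [0.1] + [0.3] * 7        # w[0] is the bias weight w_0
y = neuron(x, w)
print(-1.0 < y < 1.0)        # tanh keeps the output in (-1, 1)
```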
Network Architectures
• The multilayer perceptron (MLP) is the most popular architecture
• Networks with connections across layers, such as bridged multilayer perceptron (BMLP) networks and fully connected cascade (FCC) networks, are much more powerful than MLP networks.
• Wilamowski, B. M., Hunter, D., Malinowski, A., "Solving parity-N problems with feedforward neural networks," Proc. 2003 IEEE IJCNN, pp. 2546-2551, IEEE Press, 2003.
• M. E. Hohil, D. Liu, and S. H. Smith, "Solving the N-bit parity problem using neural networks," Neural Networks, vol. 12, pp. 1321-1323, 1999.
• Example: smallest networks for solving parity-7 problem (analytical results)
(Diagrams: MLP network, BMLP network, FCC network)
Outline
• Why Neural Networks
• Network Architectures
• Training Algorithms
• How to Design Neural Networks
• Problems in Second Order Algorithms
• Proposed Second Order Computation
• Proposed Forward-Only Algorithm
• Neural Network Trainer
• Conclusion & Recent Research
Error Back Propagation Algorithm
• The most popular algorithm for neural network training
• Update rule of EBP algorithm [3]
• Developed based on gradient optimization
• Advantages: – Easy
– Stable
• Disadvantages:– Very limited power
– Slow convergence
w_{k+1} = w_k − α g_k
g = ∂E/∂w = [∂E/∂w_1, ∂E/∂w_2, ∂E/∂w_3, …]^T,  w = [w_1, w_2, w_3, …]^T
Improvement of EBP
• Improved gradient using momentum [4]
• Adjusted learning constant [5-6]
Δw_{k+1} = −α g_k + m Δw_k,  with momentum term 0 ≤ m ≤ 1
(Diagrams A and B: the current gradient g_k combined with the previous update Δw_k gives the new update Δw_{k+1})
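The plain EBP update and its momentum variant can be sketched on a toy quadratic error surface (the error function, learning constant α, and momentum m here are illustrative choices, not values from the thesis experiments):

```python
import numpy as np

def gradient(w):
    # gradient of an ill-conditioned quadratic error E(w) = 0.5*(w1^2 + 10*w2^2)
    return np.array([w[0], 10.0 * w[1]])

def train(alpha, m, steps=100):
    w = np.array([1.0, 1.0])
    dw = np.zeros(2)
    for _ in range(steps):
        dw = -alpha * gradient(w) + m * dw   # m = 0 gives plain EBP
        w = w + dw
    return np.linalg.norm(w)                 # distance from the minimum at 0

plain = train(alpha=0.05, m=0.0)
momentum = train(alpha=0.05, m=0.5)
print(momentum < plain)  # True: momentum converges faster on this surface
```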
Newton Algorithm
• Newton algorithm: using the derivative of gradient to evaluate the change of gradient, then select proper learning constants in each direction [7]
• Advantages:– Fast convergence
• Disadvantages:– Not stable– Requires computation of second order derivative
w_{k+1} = w_k − H_k^{-1} g_k
g_i = ∂E/∂w_i,  E = Σ_{p=1}^{P} Σ_{m=1}^{M} e_{pm}^2
H = [∂²E/∂w_i ∂w_j]: the N × N matrix of second derivatives, from ∂²E/∂w_1² and ∂²E/∂w_1∂w_2 up to ∂²E/∂w_N²
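On a quadratic error surface, the Newton rule reaches the minimum in a single step, which is the source of its fast convergence. A small numpy sketch (the quadratic itself is a made-up example):

```python
import numpy as np

# quadratic error E(w) = 0.5 * w^T A w - b^T w, with minimum at A^{-1} b
A = np.array([[3.0, 1.0], [1.0, 2.0]])   # Hessian H (constant for a quadratic)
b = np.array([1.0, 1.0])

w = np.array([5.0, -5.0])                # arbitrary starting point
g = A @ w - b                            # gradient at w
w = w - np.linalg.solve(A, g)            # Newton step: w - H^{-1} g

print(np.allclose(w, np.linalg.solve(A, b)))  # True: minimum reached in one step
```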
Gauss-Newton Algorithm
• Gauss-Newton algorithm: eliminate the second order derivatives of the Newton method by introducing the Jacobian matrix
• Advantages:
– Fast convergence
• Disadvantages:
– Not stable
w_{k+1} = w_k − (J_k^T J_k)^{-1} J_k^T e_k
H ≈ J^T J,  g = J^T e
J = [∂e_{pm}/∂w_n]: P·M rows (one per pattern p and output m) by N columns (one per weight)
e = [e_{1,1}, e_{1,2}, …, e_{1,M}, …, e_{P,1}, …, e_{P,M}]^T
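One Gauss-Newton step uses only first derivatives: Δw = −(JᵀJ)⁻¹Jᵀe. A sketch on a made-up one-parameter least-squares problem (not a neural network; the exponential model is an illustrative assumption):

```python
import numpy as np

# fit y = exp(a * t) to data; errors e_p = model(t_p) - y_p
t = np.linspace(0.0, 1.0, 20)
y = np.exp(0.7 * t)                      # data generated with a = 0.7

a = 0.0                                  # initial weight
for _ in range(10):
    e = np.exp(a * t) - y                # error vector, one entry per pattern
    J = (t * np.exp(a * t))[:, None]     # Jacobian de_p/da, shape (P, 1)
    a = a - np.linalg.solve(J.T @ J, J.T @ e)[0]   # dw = -(J^T J)^{-1} J^T e

print(abs(a - 0.7) < 1e-6)   # True: converges in a few iterations
```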
Levenberg Marquardt Algorithm
• LM algorithm: blends the EBP algorithm and the Gauss-Newton algorithm [8-9]
– When the evaluation error increases, μ increases and the LM algorithm switches toward the EBP algorithm
– When the evaluation error decreases, μ decreases and the LM algorithm switches toward the Gauss-Newton method
• Advantages:
– Fast convergence
– Stable training
• Compared with first order algorithms, the LM algorithm has much more powerful search ability, but it also requires more complex computation
w_{k+1} = w_k − (J_k^T J_k + μ_k I)^{-1} J_k^T e_k
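The μ-switching logic can be sketched as follows; the toy curve-fitting problem and the factor-of-10 μ adjustment are illustrative assumptions, not the thesis implementation:

```python
import numpy as np

def errors(w, t, y):
    return w[0] * np.exp(w[1] * t) - y               # residuals e_p

def jacobian(w, t):
    return np.column_stack([np.exp(w[1] * t),        # de/dw0
                            w[0] * t * np.exp(w[1] * t)])  # de/dw1

t = np.linspace(0.0, 1.0, 30)
y = 2.0 * np.exp(-1.5 * t)                           # data from w = (2.0, -1.5)
w, mu = np.array([1.0, 0.0]), 0.01

for _ in range(50):
    e = errors(w, t, y)
    J = jacobian(w, t)
    dw = np.linalg.solve(J.T @ J + mu * np.eye(2), J.T @ e)
    if np.sum(errors(w - dw, t, y) ** 2) < np.sum(e ** 2):
        w, mu = w - dw, mu / 10      # error decreased: toward Gauss-Newton
    else:
        mu *= 10                     # error increased: toward a small gradient (EBP-like) step

print(np.allclose(w, [2.0, -1.5], atol=1e-4))  # True
```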
Comparison of Different Algorithms
• Training XOR patterns using different algorithms
XOR problem – EBP: success rate 100% (α=0.1) / 18% (α=10); average iterations 17,845.44 / 179.00; average time 3,413.26 ms / 46.83 ms
XOR problem – EBP with momentum (m=0.5): success rate 100% (α=0.1) / 100% (α=10); average iterations 18,415.84 / 187.76; average time 4,687.79 ms / 39.27 ms
XOR problem – EBP with adjusted learning constant: success rate 100%; average iterations 170.23; average time 41.19 ms
XOR problem – Gauss-Newton algorithm: success rate 6%; average iterations 1.29; average time 2.29 ms
XOR problem – LM algorithm: success rate 100%; average iterations 5.49; average time 4.35 ms
Outline
• Why Neural Networks
• Network Architectures
• Training Algorithms
• How to Design Neural Networks
• Problems in Second Order Algorithms
• Proposed Second Order Computation
• Proposed Forward-Only Algorithm
• Neural Network Trainer
• Conclusion & Recent Research
How to Design Neural Networks
• Traditional design:
– Most popular training algorithm: the EBP algorithm
– Most popular network architecture: the MLP network
• Results:
– Large neural networks
– Poor generalization ability
– Many engineers moved to other methods, such as fuzzy systems
How to Design Neural Networks
• B. M. Wilamowski, "Neural Network Architectures and Learning Algorithms: How Not to Be Frustrated with Neural Networks," IEEE Ind. Electron. Mag., vol. 3, no. 4, pp. 56-63, 2009.
– Over-fitting problem
– Mismatch between the number of training patterns and the network size
• Recommended design policy: compact networks benefit generalization ability
– Powerful training algorithm: the LM algorithm
– Efficient network architectures: BMLP and FCC networks
(Figures: fitting results with 2 through 9 neurons)
Outline
• Why Neural Networks
• Network Architectures
• Training Algorithms
• How to Design Neural Networks
• Problems in Second Order Algorithms
• Proposed Second Order Computation
• Proposed Forward-Only Algorithm
• Neural Network Trainer
• Conclusion & Recent Research
Problems in Second Order Algorithms
• Matrix inversion: (J^T J + μI)^{-1}
– The nature of second order algorithms
– The size of the matrix is proportional to the size of the network
– As the network grows, second order algorithms may not be as efficient as first order algorithms
Problems in Second Order Algorithms
• Architecture limitation
• M. T. Hagan and M. Menhaj, "Training feedforward networks with the Marquardt algorithm," IEEE Trans. on Neural Networks, vol. 5, no. 6, pp. 989-993, 1994. (2,474 citations)
– Only developed for training MLP networks
– Not suitable for designing compact networks
• Neuron-by-Neuron (NBN) Algorithm
• B. M. Wilamowski, N. J. Cotton, O. Kaynak and G. Dundar, "Computing Gradient Vector and Jacobian Matrix in Arbitrarily Connected Neural Networks," IEEE Trans. on Industrial Electronics, vol. 55, no. 10, pp. 3784-3790, Oct. 2008.
– SPICE computation routines
– Capable of training arbitrarily connected neural networks
– Compact neural network design: NBN algorithm + BMLP (FCC) networks
– Very complex computation
Problems in Second Order Algorithms
• Memory limitation:– The size of Jacobian matrix J is P×M×N
– P is the number of training patterns
– M is the number of outputs
– N is the number of weights
• In practice, the number of training patterns is large, and using as many patterns as possible is encouraged
• MNIST handwritten digit database [10]: 60,000 training patterns, 784 inputs and 10 outputs. Even with the simplest network architecture (1 neuron per output), the required memory is nearly 35 GB.
• This exceeds the limits of most Windows compilers.
(J^T J + μI)^{-1}
J = [∂e_{pm}/∂w_n]: P·M rows (patterns × outputs) by N columns (weights)
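The 35 GB figure can be checked by direct arithmetic, assuming the Jacobian is stored in double precision:

```python
# MNIST-style sizing from the slide: 60,000 patterns, 784 inputs, 10 outputs,
# one neuron per output -> N = (784 + 1) * 10 weights (each neuron has a bias)
P, M = 60_000, 10
N = (784 + 1) * 10

rows = P * M                       # the Jacobian J has P*M rows and N columns
bytes_double = rows * N * 8        # 8 bytes per double-precision element
print(round(bytes_double / 2**30, 1))   # ~35.1 GiB, matching "nearly 35 GB"
```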
Problems in Second Order Algorithms
• Computational duplication:
– Forward computation: calculate errors
– Backward computation: error backpropagation
• In second order algorithms, both the Hagan and Menhaj LM algorithm and the NBN algorithm must repeat the error backpropagation process for each output.
– Very complex
– Inefficient for networks with multiple outputs
(Diagram: inputs → forward computation → outputs, with the backward computation repeated once per output)
Outline
• Why Neural Networks
• Network Architectures
• Training Algorithms
• How to Design Neural Networks
• Problems in Second Order Algorithms
• Proposed Second Order Computation
• Proposed Forward-Only Algorithm
• Neural Network Trainer
• Conclusion & Recent Research
Proposed Second Order Computation – Basic Theory
• Matrix Algebra [11]
• In neural network training, consider that:
– Each pattern is related to one row of the Jacobian matrix
– Patterns are independent of each other
J^T (N × P·M) multiplied by J (P·M × N) gives H (N × N)
Column-row multiplication: J^T J = Σ q, where each q = j^T j is an N × N sub-matrix built from a single row j of J
Multiplication Methods
Memory comparison (elements to store):
– Row-column multiplication: (P × M) × N + N × N + N
– Column-row multiplication: N × N + N
– Difference: (P × M) × N
Multiplication Methods
Computation comparison (additions / multiplications):
– Row-column: (P × M) × N × N / (P × M) × N × N
– Column-row: N × N × (P × M) / N × N × (P × M)
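The column-row scheme can be verified numerically: accumulating the outer product of one Jacobian row at a time reproduces JᵀJ exactly, without ever holding the whole J. A numpy sketch, with a random matrix standing in for a real network Jacobian:

```python
import numpy as np

rng = np.random.default_rng(42)
P, M, N = 100, 3, 8                  # patterns, outputs, weights

# row-column: build the full (P*M) x N Jacobian, then multiply
J = rng.standard_normal((P * M, N))
H_full = J.T @ J                     # requires (P*M)*N elements of storage

# column-row: one row at a time, accumulating N x N partial products q = j^T j
H_acc = np.zeros((N, N))
g_acc = np.zeros(N)
e = rng.standard_normal(P * M)       # error vector
for pm in range(P * M):
    j = J[pm]                        # a single row (in practice J is never stored whole)
    H_acc += np.outer(j, j)          # q_pm
    g_acc += j * e[pm]               # eta_pm

print(np.allclose(H_full, H_acc) and np.allclose(J.T @ e, g_acc))  # True
```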
Proposed Second Order Computation – Derivation
• Hagan and Menhaj LM algorithm / NBN algorithm:
w_{k+1} = w_k − (J_k^T J_k + μI)^{-1} J_k^T e_k

• Improved computation:
w_{k+1} = w_k − (Q_k + μI)^{-1} g_k

where the quasi-Hessian matrix Q and the gradient vector g are accumulated pattern by pattern, without forming J:
Q = Σ_{p=1}^{P} Σ_{m=1}^{M} q_{pm},  g = Σ_{p=1}^{P} Σ_{m=1}^{M} η_{pm}
q_{pm} = j_{pm}^T j_{pm},  η_{pm} = j_{pm}^T e_{pm}
j_{pm} = [∂e_{pm}/∂w_1, ∂e_{pm}/∂w_2, …, ∂e_{pm}/∂w_N]  (a single row of J)
Proposed Second Order Computation – Pseudo Code
• Properties:
– No need to store the Jacobian matrix
– Vector operations instead of matrix operations
• Main contributions:
– Significant memory reduction
– The memory reduction also benefits computation speed
– No tradeoff!
• The memory limitation caused by Jacobian matrix storage in second order algorithms is solved
• Again, considering the MNIST problem, the memory cost for storing Jacobian elements could be reduced from more than 35 gigabytes to nearly 30.7 kilobytes
% Initialization
Q = 0; g = 0
% Improved computation
for p = 1:P              % number of patterns
    % Forward computation
    ...
    for m = 1:M          % number of outputs
        % Backward computation
        ...
        calculate vector j_pm;
        calculate sub-matrix q_pm;
        calculate sub-vector η_pm;
        Q = Q + q_pm;
        g = g + η_pm;
    end;
end;

Pseudo Code
Proposed Second Order Computation – Experimental Results
• Memory comparison (parity-N problems):
– N=14: 16,384 patterns, 15 neurons, Jacobian matrix size 5,406,720, weight vector size 330, average iterations 99.2, success rate 13%
– N=16: 65,536 patterns, 17 neurons, Jacobian matrix size 27,852,800, weight vector size 425, average iterations 166.4, success rate 9%
– Actual memory cost: traditional LM 79.21 MB (N=14) and 385.22 MB (N=16); improved LM 3.41 MB and 4.30 MB
• Time comparison (parity-N problems):
– N=9, 11, 13, 15: 512, 2,048, 8,192, 32,768 patterns; 10, 12, 14, 16 neurons; 145, 210, 287, 376 weights
– Average iterations: 38.51, 59.02, 68.08, 126.08; success rates: 58%, 37%, 24%, 12%
– Average training time: traditional LM 0.78 s, 68.01 s, 1,508.46 s, 43,417.06 s; improved LM 0.33 s, 22.09 s, 173.79 s, 2,797.93 s
Outline
• Why Neural Networks
• Network Architectures
• Training Algorithms
• How to Design Neural Networks
• Problems in Second Order Algorithms
• Proposed Second Order Computation
• Proposed Forward-Only Algorithm
• Neural Network Trainer
• Conclusion & Recent Research
Traditional Computation – Forward Computation
• For each training pattern p:
• Calculate net for neuron j:
net_j = Σ_{i=1}^{ni} y_{j,i} w_{j,i} + w_{j,0}
• Calculate the output of neuron j:
y_j = f_j(net_j)
• Calculate the derivative (slope) for neuron j:
s_j = ∂y_j/∂net_j = ∂f_j(net_j)/∂net_j
• Calculate the output at network output m:
o_m = F_m(y_j)
• Calculate the error at output m:
e_{pm} = o_{pm} − d_{pm}
(Diagram: neuron j with inputs y_{j,1} … y_{j,ni}, weights w_{j,1} … w_{j,ni}, bias weight w_{j,0}, slope s_j, and output y_j feeding F_m toward output o_m)
Traditional Computation – Backward Computation
• For first order algorithms, calculate delta [12]:
δ_j = s_j Σ_{m=1}^{no} F'_{m,j} e_m
then build the gradient vector:
g_{j,i} = ∂E/∂w_{j,i} = δ_j y_{j,i}
• For second order algorithms, calculate delta:
δ_{m,j} = s_j F'_{m,j}
then the Jacobian elements:
∂e_{pm}/∂w_{j,i} = δ_{m,j} y_{j,i}
Proposed Forward-Only Algorithm
• Extend the concept of the backpropagation factor δ:
– Original definition: backpropagated from output m to neuron j, δ_{m,j} = s_j F'_{m,j}
– Our definition: backpropagated from neuron k to neuron j, δ_{k,j} = s_j F'_{k,j}
(Diagram: network inputs → neuron j (net_j, s_j, y_j) → neuron k (net_k, s_k, y_k) → network outputs o_1 … o_m)
Proposed Forward-Only Algorithm
• Regular table:
– Lower triangular elements (k ≥ j): the δ matrix, which has triangular shape
– Diagonal elements: δ_{k,k} = s_k
– Upper triangular elements: the weight connections between neurons
(Table: rows and columns indexed by neurons 1, 2, …, j, …, k, …, nn; the lower triangle holds δ_{k,j}, the diagonal holds δ_{k,k} = s_k, and the upper triangle holds the weights w_{j,k})
General formula:
δ_{k,j} = s_k Σ_{i=j}^{k−1} w_{i,k} δ_{i,j}   for k > j
δ_{k,j} = s_k   for k = j
δ_{k,j} = 0   for k < j
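The general formula can be exercised on a tiny fully connected cascade network; the 3-neuron topology, random weights, and tanh activation here are illustrative assumptions. A Jacobian element computed from the δ table is checked against a finite difference on the corresponding weight:

```python
import numpy as np

def forward(weights, x):
    """Forward pass through an FCC cascade; neuron k sees [1, x..., y_1..y_{k-1}]."""
    ys, slopes, inputs = [], [], []
    for w in weights:
        inp = np.concatenate(([1.0], x, ys))
        y = np.tanh(inp @ w)                 # net, then tanh activation
        ys.append(y)
        slopes.append(1.0 - y ** 2)          # s_k = f'(net_k)
        inputs.append(inp)
    return np.array(ys), np.array(slopes), inputs

def delta_table(weights, slopes, n_in):
    """delta[k, j] = s_k * sum_{i=j}^{k-1} w_{i,k} * delta[i, j]; delta[j, j] = s_j."""
    n = len(weights)
    d = np.zeros((n, n))
    for j in range(n):
        d[j, j] = slopes[j]
        for k in range(j + 1, n):
            # the weight of neuron k that multiplies y_i sits at index 1 + n_in + i
            acc = sum(weights[k][1 + n_in + i] * d[i, j] for i in range(j, k))
            d[k, j] = slopes[k] * acc
    return d

rng = np.random.default_rng(0)
n_in = 2
weights = [rng.standard_normal(1 + n_in + k) * 0.5 for k in range(3)]
x = np.array([0.3, -0.7])

ys, slopes, inputs = forward(weights, x)
d = delta_table(weights, slopes, n_in)

# Jacobian element d(y_3)/d(w_{1,x1}) from the table: delta_{3,1} times neuron 1's input x1
j_table = d[2, 0] * inputs[0][1]

# checked against a finite difference on the same weight
eps = 1e-6
wp = [w.copy() for w in weights]
wp[0][1] += eps
j_fd = (forward(wp, x)[0][2] - ys[2]) / eps

print(abs(j_table - j_fd) < 1e-4)  # True
```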
Proposed Forward-Only Algorithm
• Train arbitrarily connected neural networks
(Examples: regular δ tables for a 6-neuron fully connected cascade network, a 6-neuron MLP network, and a 6-neuron arbitrarily connected network; entries without a connection are zero)
Proposed Forward-Only Algorithm
• Train networks with multiple outputs
• The more outputs a network has, the more efficient the forward-only algorithm becomes
(Diagrams: example networks with 1, 2, 3 and 4 outputs)
Proposed Forward-Only Algorithm
• Pseudo codes of two different algorithms
• In the forward-only computation, the backward computation of the traditional algorithm is replaced by extra computation in the forward process
Forward-only algorithm:
for all patterns (np)
    % Forward computation
    for all neurons (nn)
        for all weights of the neuron (nx)
            calculate net;
        end;
        calculate neuron output;
        calculate neuron slope;
        set current slope as delta;
        for weights connected to previous neurons (ny)
            for previous neurons (nz)
                multiply delta through weights, then sum;
            end;
            multiply the sum by the slope;
        end;
        related Jacobian elements computation;
    end;
    for all outputs (no)
        calculate error;
    end;
end;

Traditional forward-backward algorithm:
for all patterns
    % Forward computation
    for all neurons (nn)
        for all weights of the neuron (nx)
            calculate net;
        end;
        calculate neuron output;
        calculate neuron slope;
    end;
    for all outputs (no)
        calculate error;
        % Backward computation
        initialize delta as slope;
        for all neurons starting from output neurons (nn)
            for the weights connected to other neurons (ny)
                multiply delta through weights;
                sum the backpropagated delta at proper nodes;
            end;
            multiply delta by slope (for hidden neurons);
        end;
    end;
end;
Proposed Forward-Only Algorithm
• Computation cost estimation
• Properties of the forward-only algorithm:
– Simplified computation: organized in a regular table with a general formula
– Easily adapted for training arbitrarily connected neural networks
– Improved computation efficiency for networks with multiple outputs
• Tradeoff: extra memory is required to store the extended δ array
Hagan and Menhaj computation:
– Forward part: +/−: nn×nx + 3nn + no; ×/÷: nn×nx + 4nn; exp: nn
– Backward part: +/−: no×nn×ny; ×/÷: no×nn×ny + no×(nn − no); exp: 0
Forward-only computation:
– Forward part: +/−: nn×nx + 3nn + no + nn×ny×nz; ×/÷: nn×nx + 4nn + nn×ny + nn×ny×nz; exp: nn
– Backward part: none
Difference (traditional minus forward-only):
– +/−: nn×ny×(no − 1); ×/÷: nn×ny×(no − 1) + no×(nn − no) − nn×ny×nz; exp: 0
(Plot: ratio of time consumption, forward-only vs. traditional, against the number of hidden neurons (0 to 100), for 1 to 10 outputs; MLP networks with one hidden layer and 20 inputs)
Proposed Forward-Only Algorithm
• Experiments: training compact neural networks with good generalization ability
Success rate, average iterations, and average time (EBP vs. FO):
– 8 neurons: 0% / 5%; failing / 222.5; failing / 0.33 s
– 9 neurons: 0% / 25%; failing / 214.6; failing / 0.58 s
– 10 neurons: 0% / 61%; failing / 183.5; failing / 0.70 s
– 11 neurons: 0% / 76%; failing / 177.2; failing / 0.93 s
– 12 neurons: 0% / 90%; failing / 149.5; failing / 1.08 s
– 13 neurons: 35% / 96%; 573,226 / 142.5; 624.88 s / 1.35 s
– 14 neurons: 42% / 99%; 544,734 / 134.5; 651.66 s / 1.76 s
– 15 neurons: 56% / 100%; 627,224 / 119.3; 891.90 s / 1.85 s
– 8 neurons, FO: SSE_train = 0.0044, SSE_verify = 0.0080
– 8 neurons, EBP: SSE_train = 0.0764, SSE_verify = 0.1271 (under-fitting)
– 12 neurons, EBP: SSE_train = 0.0018, SSE_verify = 0.4909 (over-fitting)
Proposed Forward-Only Algorithm
• Experiments: comparison of computation efficiency (time cost per iteration)
ASCII to Images:
– Traditional: forward 8.24 ms, backward 1,028.74 ms (relative time 100.0%)
– Forward-only: forward 61.13 ms, backward 0.00 ms (relative time 5.9%)
Error Correction (8-bit signal):
– Traditional: forward 40.59 ms, backward 468.14 ms (relative time 100.0%)
– Forward-only: forward 175.72 ms, backward 0.00 ms (relative time 34.5%)
Forward Kinematics [13]:
– Traditional: forward 0.307 ms, backward 0.771 ms (relative time 100.0%)
– Forward-only: forward 0.727 ms, backward 0.00 ms (relative time 67.4%)
(Diagram: two-link arm with link lengths L1 and L2, joint angles α and β, and the end effector)
Outline
• Why Neural Networks
• Network Architectures
• Training Algorithms
• How to Design Neural Networks
• Problems in Second Order Algorithms
• Proposed Second Order Computation
• Proposed Forward-Only Algorithm
• Neural Network Trainer
• Conclusion & Recent Research
Software
• The NBN Trainer tool is developed in Visual C++ and used for training neural networks
• Pattern classification and recognition
• Function approximation
• Available online (currently free): http://www.eng.auburn.edu/~wilambm/nnt/index.htm
Parity-2 Problem
• Parity-2 Patterns
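Parity-2 is the XOR problem; a generator for parity-N training patterns of the kind used throughout the experiments (a sketch; the actual trainer's pattern encoding may differ):

```python
from itertools import product

def parity_patterns(n):
    """All 2^n binary input patterns, with the parity (XOR) of the bits as target."""
    return [(bits, sum(bits) % 2) for bits in product([0, 1], repeat=n)]

for bits, target in parity_patterns(2):   # the parity-2 (XOR) patterns
    print(bits, target)
print(len(parity_patterns(7)))            # 128 patterns for the parity-7 problem
```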
Outline
• Why Neural Networks
• Network Architectures
• Training Algorithms
• How to Design Neural Networks
• Problems in Second Order Algorithms
• Proposed Second Order Computation
• Proposed Forward-Only Algorithm
• Neural Network Trainer
• Conclusion & Recent Research
Conclusion
• Second order algorithms are more efficient than first order algorithms for training neural networks
• The proposed second order computation removes Jacobian matrix storage and multiplication, solving the memory limitation
• The proposed forward-only algorithm simplifies the computation process in second order training: a regular table + a general formula
• The proposed forward-only algorithm can handle arbitrarily connected neural networks
• The proposed forward-only algorithm has speed benefit for networks with multiple outputs
Recent Research
• RBF networks:
– ErrCor algorithm: a hierarchical training algorithm
– The network size increases based on the training information
– No more trial-and-error design
• Applications of Neural Networks (future work)– Dynamic controller design
– Smart grid distribution systems
– Pattern recognition in EDA software design
References[1] J. X. Peng, Kang Li, G.W. Irwin, "A New Jacobian Matrix for Optimal Learning of Single-Layer Neural Networks," IEEE Trans. on
Neural Networks, vol. 19, no. 1, pp. 119-129, Jan 2008
[2] K. Hornik, M. Stinchcombe and H. White, "Multilayer Feedforward Networks Are Universal Approximators," Neural Networks, vol. 2, issue 5, pp. 359-366, 1989.
[3] D. E. Rumelhart, G. E. Hinton and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, pp. 533-536, 1986.
[4] V. V. Phansalkar, P.S. Sastry, "Analysis of the back-propagation algorithm with momentum," IEEE Trans. on Neural Networks, vol. 5, no. 3, pp. 505-506, March 1994.
[5] M. Riedmiller, H. Braun, "A direct adaptive method for faster backpropagation learning: The RPROP algorithm". Proc. International Conference on Neural Networks, San Francisco, CA, 1993, pp. 586-591.
[6] Scott E. Fahlman. Faster-learning variations on back-propagation: An empirical study. In T. J. Sejnowski G. E. Hinton and D. S. Touretzky, editors, 1988 Connectionist Models Summer School, San Mateo, CA, 1988. Morgan Kaufmann.
[7] M. R. Osborne, "Fisher’s method of scoring," Internat. Statist. Rev., 86 (1992), pp. 271-286.
[8] K. Levenberg, "A method for the solution of certain problems in least squares," Quarterly of Applied Mathematics, 5, pp. 164-168, 1944.
[9] D. Marquardt, "An algorithm for least-squares estimation of nonlinear parameters," SIAM J. Appl. Math., vol. 11, no. 2, pp. 431-441, Jun. 1963.
[10] L. J. Cao, S. S. Keerthi, Chong-Jin Ong, J. Q. Zhang, U. Periyathamby, Xiu Ju Fu, H. P. Lee, "Parallel sequential minimal optimization for the training of support vector machines," IEEE Trans. on Neural Networks, vol. 17, no. 4, pp. 1039- 1049, April 2006.
[11] D. C. Lay, Linear Algebra and its Applications. Addison-Wesley Publishing Company, 3rd version, pp. 124, July, 2005.
[12] R. Hecht-Nielsen, "Theory of the Back Propagation Neural Network," Proc. 1989 IEEE IJCNN, pp. 1593-1605, IEEE Press, New York, 1989.
[13] N. J. Cotton and B. M. Wilamowski, "Compensation of Nonlinearities Using Neural Networks Implemented on Inexpensive Microcontrollers" IEEE Trans. on Industrial Electronics, vol. 58, No 3, pp. 733-740, March 2011.
Prepared Publications – Journals• H. Yu, T. T. Xie, Stanisław Paszczyñski and B. M. Wilamowski, "Advantages of Radial Basis
Function Networks for Dynamic System Design," IEEE Trans. on Industrial Electronics (Accepted and scheduled publication in December, 2011)
• H. Yu, T. T. Xie and B. M. Wilamowski, "Error Correction – A Robust Learning Algorithm for Designing Compact Radial Basis Function Networks," IEEE Trans. on Neural Networks (Major revision)
• T. T. Xie, H. Yu, J. Hewllet, Pawel Rozycki and B. M. Wilamowski, "Fast and Efficient Second Order Method for Training Radial Basis Function Networks," IEEE Trans. on Neural Networks (Major revision)
• A. Malinowski and H. Yu, "Comparison of Various Embedded System Technologies for Industrial Applications," IEEE Trans. on Industrial Informatics, vol. 7, issue 2, pp. 244-254, May 2011
• B. M. Wilamowski and H. Yu, "Improved Computation for Levenberg Marquardt Training," IEEE Trans. on Neural Networks, vol. 21, no. 6, pp. 930-937, June 2010 (14 citations)
• B. M. Wilamowski and H. Yu, "Neural Network Learning Without Backpropagation," IEEE Trans. on Neural Networks, vol. 21, no.11, pp. 1793-1803, Nov. 2010 (5 citations)
• Pierluigi Siano, Janusz Kolbusz, H. Yu and Carlo Cecati, "Real Time Operation of a Smart Microgrid via FCN Networks and Optimal Power Flow," IEEE Trans. on Industrial Informatics (under review)
Prepared Publications – Conferences
• H. Yu and B. M. Wilamowski, "Efficient and Reliable Training of Neural Networks," IEEE Human System Interaction Conference, HSI 2009, Catania. Italy, May 21-23, 2009, pp. 109-115. (Best paper award in Computational Intelligence section) (11 citations)
• H. Yu and B. M. Wilamowski, "C++ Implementation of Neural Networks Trainer," 13th IEEE Intelligent Engineering Systems Conference, INES 2009, Barbados, April 16-18, 2009, pp. 237-242 (8 citations)
• H. Yu and B. M. Wilamowski, "Fast and efficient training of neural networks," in Proc. 3rd IEEE Human System Interaction Conf. HSI 2010, Rzeszow, Poland, May 13-15, 2010, pp. 175-181 (2 citations)
• H. Yu and B. M. Wilamowski, "Neural Network Training with Second Order Algorithms," monograph by Springer on Human-Computer Systems Interaction. Background and Applications, 31st October, 2010. (Accepted)
• H. Yu, T. T. Xie, M. Hamilton and B. M. Wilamowski, "Comparison of Different Neural Network Architectures for Digit Image Recognition," in Proc. IEEE Human System Interaction Conf. HSI 2011, Yokohama, Japan, pp. 98-103, May 19-21, 2011
• N. Pham, H. Yu and B. M. Wilamowski, "Neural Network Trainer through Computer Networks," 24th IEEE International Conference on Advanced Information Networking and Applications, AINA 2010, Perth, Australia, April 20-23, 2010, pp. 1203-1209 (1 citation)
• T. T. Xie, H. Yu and B. M. Wilamowski, "Replacing Fuzzy Systems with Neural Networks," in Proc. 3rd IEEE Human System Interaction Conf. HSI 2010, Rzeszow, Poland, May 13-15, 2010, pp. 189-193.
• T. T. Xie, H. Yu and B. M. Wilamowski, "Comparison of Traditional Neural Networks and Radial Basis Function Networks," in Proc. 20th IEEE International Symposium on Industrial Electronics, ISIE2011, Gdansk, Poland, 27-30 June 2011 (Accepted)
Prepared Publications – Chapters for IE Handbook (2nd Edition)
• H. Yu and B. M. Wilamowski, "Levenberg Marquardt Training," Industrial Electronics Handbook, vol. 5 – INTELLIGENT SYSTEMS, 2nd Edition, 2010, chapter 12, pp. 12-1 to 12-16, CRC Press.
• H. Yu and M. Carroll, "Interactive Website Design Using Python Script," Industrial Electronics Handbook, vol. 4 – INDUSTRIAL COMMUNICATION SYSTEMS, 2nd Edition, 2010, chapter 62, pp. 62-1 to 62-8, CRC Press.
• B. M. Wilamowski, H. Yu and N. Cotton, "Neuron by Neuron Algorithm," Industrial Electronics Handbook, vol. 5 – INTELLIGENT SYSTEMS, 2nd Edition, 2010, chapter 13, pp. 13-1 to 13-24, CRC Press.
• T. T. Xie, H. Yu and B. M. Wilamowski, "Neuro-fuzzy System," Industrial Electronics Handbook, vol. 5 – INTELLIGENT SYSTEMS, 2nd Edition, 2010, chapter 20, pp. 20-1 to 20-9, CRC Press.
• B. M. Wilamowski, H. Yu and K. T. Chung, "Parity-N problems as a vehicle to compare efficiency of neural network architectures," Industrial Electronics Handbook, vol. 5 – INTELLIGENT SYSTEMS, 2nd Edition, 2010, chapter 10, pp. 10-1 to 10-8, CRC Press.
Thanks