Zoran Obradovic - Children's Rights · 2011-03-15  · 1 In simple terms, a “database schema”is...

Report on the KIDS System: Review and Analysis Zoran Obradovic March 15, 2011


Report on the KIDS System: Review and Analysis

Zoran Obradovic

March 15, 2011


1. Executive Summary

I was asked by the Plaintiffs in the federal class action D.G. v. Henry to analyze the

KIDS System, the statewide automated child welfare information system of the Oklahoma

Department of Human Services (“DHS”), and the processes used by DHS to generate reports

from the KIDS System. I am currently the Director of the Center for Information Science

and Technology at Temple University and a Professor of Computer and Information

Sciences at Temple University. I also hold a Ph.D. in computer science, have expertise that

includes data mining and data management and have over 200 publications.

The KIDS System lacks structured tracking and testing processes. These processes

are crucial to any properly functioning data management system. DHS’s failure to

implement or effectively manage these processes has led to serious data quality problems

which have negatively – and severely – impacted the child welfare reports that are based

on data from the KIDS System. There is a significant likelihood that every child welfare

report contains inaccurate, unreliable and/or outdated information. Every child welfare

worker, supervisor and manager who utilizes these reports is potentially utilizing an

erroneous report, thus putting in harm’s way the very children they are responsible for

serving.

The primary problems with the child welfare data management system in use at

DHS are summarized in this section.

1. Management Problems and Organizational Issues. The personnel of the Technology and Governance Unit (“TGU”) have primary responsibility for the KIDS System and child welfare reporting. They are supported by a small group of reports programmers from the Data Services Division (“DSD”), who are “co-located” with TGU. The TGU personnel have little or no background in computer programming and are under-qualified for their positions. The TGU personnel write the queries for a key set of reports – Access reports – despite their lack of qualifications to do so correctly. They also have ultimate responsibility for a second set of reports – WebFOCUS reports – the queries for which are written by the reports programmers. TGU does not monitor who accesses the child welfare reports and allows inaccurate and unused reports to continue to be produced. The former manager of the reports programmers had a very hands-off management style and an inadequate background in the programming languages used by his direct reports. He allowed job partitioning and heavy specialization, both poor management practices. DHS also relies much too heavily on co-location to ensure adequate coordination between TGU and the DSD reports programmers. In practice, these two groups are not collaborating well due to a lack of formal regular communication. These poor management practices and organizational issues enable a number of serious data management problems.


2. Lack of Adequate Change Control. The KIDS System is constantly changing, which affects all of the programs and applications that interact with the KIDS System, including the computer programs that underlie the child welfare reports. Changes to database schemas1 like the KIDS System often require corresponding changes to the applications the database supports. Therefore, it is critical to track and manage database schemas properly. The KIDS System and the child welfare reports require (a) a source code version control system, which records all changes made to the source code and allows for a way to return to previous versions when an error is found; and (b) database change management software, which tracks changes to the database, applies those changes to all affected programs and applications that interact with the database and allows for a way to return to previous versions when an error is found. Instead, revisions to the programming code for the child welfare reports are tracked informally using “comments” written in the code. Further, DHS relies on co-location and informal communication between TGU and DSD to ensure that all relevant changes to the KIDS System are applied to the computer programs underlying the child welfare reports. The processes used by DHS are totally inadequate and lead to a high risk of unreliable and outdated child welfare reports.

3. Lack of Adequate Quality Control. The KIDS System does not have adequate quality control to ensure that the child welfare reports are accurate. DHS does not utilize a standard and formal protocol for evaluating whether software is built according to the specifications (called “verification”) and whether the software is what the end user needs (called “validation”). The protocol should include rigorous testing of the software by the programmers who develop the programs (called “white box testing”) as well as rigorous evaluation by people who are not involved in the software development (called “black box testing”). Instead, DHS relies on face validity testing that assesses, without reference to any defined standards, whether the reports “look like they would work.” These practices are insufficient and lead to a significant risk that the child welfare reports are inaccurate and unreliable.

4. The Child Welfare Reports Are Wrong. Serious errors have already been identified in some of the Access reports. The lack of adequate change control and quality control suggests that these problems are likely more widespread, and that similar errors exist in other Access and WebFOCUS reports. In my opinion, the erroneous reports are numerous and use of these reports by child welfare workers, supervisors and managers is harmful to the children who rely on those workers.

Specific documents and facts that formed the basis for these opinions are discussed in the

following sections.

1 In simple terms, a “database schema” is the layout of a database or the blueprint that outlines the way data is organized. The schema defines the tables, fields, relationships, views, functions, queries and other elements of a database.


2. Qualifications, Scope of Review and Analysis

Children’s Rights hired me to serve as a testifying expert in the D.G. v. Henry

litigation to opine on the KIDS System and the reports generated from the KIDS System. I

am well-qualified to provide an opinion on these subjects. I received a Ph.D. in computer

science from Pennsylvania State University in 1991 and I am currently the Director of the

Center for Information Science and Technology at Temple University in Philadelphia and a

Professor of Computer and Information Sciences at Temple University. My research and

teaching interests include data mining, data management, databases and algorithms. I have

received numerous grants, including grants from the National Institutes of Health and the

National Science Foundation. I have published over 200 articles, book chapters and

refereed conference articles on topics ranging from biomedical informatics to data mining

to knowledge systems. I have also been an editor for a number of journals in my field and

have served as a chair and committee member, and have given lectures, at many conferences in my

field. A full description of my qualifications, including a list of my publications, is provided

in Supplement 1. David Schwartz, who has an M.S.W., provided research assistance for this

project. His resume is provided in Supplement 2.

My report is based upon an analysis of the documents, files and deposition

transcripts, listed in Supplement 3, provided to me by Children’s Rights. To prepare this

report, I analyzed the provided materials from December 2, 2010 to January 31, 2011.

During the past four years I have not testified as an expert witness at a trial or by

deposition. My total compensation for the preparation of this expert report, including the

amount paid to Mr. Schwartz, was $42,350. I will be paid $400/hour for preparing for and

attending deposition testimony and $450/hour for preparing for and attending trial

testimony.

3. The Architecture of the KIDS System and the Generation of Child Welfare Reports

The KIDS System is Oklahoma’s statewide automated child welfare information

system.2 It is used by DHS personnel statewide for managing cases and documenting

casework.3 Information about individual cases is input into the KIDS System by DHS

personnel. The KIDS System is updated in real time.4 So, for example, if a worker in

Oklahoma City enters information about a child into the KIDS System, a worker in Tulsa

will immediately be able to see that information by accessing the KIDS System.

2 Grissom (10/1/08), 20. 3 Grissom (10/1/08), 21, 101; Grissom (9/7/10), 54. 4 Grissom (9/7/10), 12.


The data in the KIDS System resides in an Oracle object-relational database.5 The

KIDS System is maintained by two groups of programmers. First, there is a group of

PowerBuilder programmers who are responsible for maintaining and updating the “front

end” user interface of the KIDS System, i.e., the screens that child welfare personnel see

when they utilize the KIDS System. These programmers use the PowerBuilder

programming language.6 Second, there is a group of database administrators who are

responsible for maintaining and updating the “back end” of the KIDS System, i.e., the

underlying database structure. The database administrators use the SQL query language.7

In order to create reports from the data maintained in the KIDS System, the data

must be “extracted.”8 Data extracts are constructed from the Oracle database using

programs developed in the WebFOCUS environment.9 Most of these WebFOCUS extracts

are produced weekly, but some are also produced daily and some on demand.10 Unlike the

KIDS System itself, these data extracts are not updated in real time. Instead, they are

frozen in time at the moment they are created and are only updated when a new data

extract is produced.11
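
The difference between the live system and a frozen extract can be illustrated with a minimal Python sketch. It is an illustration only: the table, column names and sample values are hypothetical and are not taken from the KIDS System. An in-memory SQLite database stands in for the Oracle database, and a plain list of rows stands in for a WebFOCUS extract.

```python
import sqlite3

# Hypothetical stand-in for the live database (not the real KIDS schema).
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE placement (child_id INTEGER PRIMARY KEY, county TEXT)")
live.execute("INSERT INTO placement VALUES (1, 'Oklahoma')")


def make_extract(conn):
    """Copy the current rows out of the live database.

    The returned list is a snapshot: it does not change when the
    live database changes afterwards.
    """
    return conn.execute(
        "SELECT child_id, county FROM placement ORDER BY child_id"
    ).fetchall()


weekly_extract = make_extract(live)

# A worker updates the live system after the extract was produced.
live.execute("UPDATE placement SET county = 'Tulsa' WHERE child_id = 1")

live_rows = make_extract(live)
print("extract:", weekly_extract)  # still reflects the state at extraction time
print("live:   ", live_rows)       # reflects the later update
```

Any report built from `weekly_extract` keeps showing the earlier value until a new extract is produced, which is the behavior described above.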

All child welfare reports are generated from these WebFOCUS extracts, either

directly or indirectly. Two different platforms are used for generating child welfare

reports.12 First, some reports are constructed in the WebFOCUS environment by a group of

three DSD WebFOCUS programmers.13 Throughout this report these programmers will be

referred to as the “DSD reports programmers.” The WebFOCUS environment allows the

DSD reports programmers to use a single template for rapid construction of many custom-

built reports by selecting various columns, criteria and output formats. These reports are

called the “WebFOCUS reports.” In addition to using WebFOCUS reports for internal state

reporting,14 DHS also uses WebFOCUS reports to report data to the federal government –

both AFCARS (adoption and foster care data) and NCANDS (child abuse and neglect data).15

The use of the WebFOCUS environment requires knowledge of WebFOCUS

programming, and since the remaining personnel supporting the KIDS System have no

5 Grissom (9/7/10), 9. 6 Nair, 39, 48-51; Jew, 34. 7 Nair, 48, 67-69. 8 Gelona, 27-29. 9 Gelona, 16, 100, 104-106, 172. 10 Gelona, 20-21; Grissom (9/7/10), 30. 11 Grissom (9/7/10), 56, 77-78. 12 Gelona, 20, 24-26. 13 Gelona, 217. 14 Grissom (9/7/10), 36. 15 Gelona, 181.


training in this programming language, a different method is used for the remaining

reports. The DSD reports programmers create three large WebFOCUS extracts – one with

data on permanency planning (YI684), one with data on resources (YI678) and one with

data on staffing (YI701) – and these extracts are converted into tables and loaded into an

Access database.16 Access is a more familiar environment17 and does not require specific

training on WebFOCUS. Members of the reporting group within TGU are responsible for

writing queries in Access.18 Throughout this report the members of the TGU reports group

will be referred to as the “TGU reports group.” Once written, the Access queries are

executed regularly and reports are generated on a weekly basis.19 These reports are called

the “Access reports.”

Every person who has access to the KIDS System is also given access to all of the

Access reports and the large majority of the WebFOCUS reports.20 Access to these reports

is provided in a variety of ways, including through DHS’s intranet.21

4. Management Problems and Organizational Issues

As discussed above, there are two groups within DHS who are responsible for

generating child welfare reports: the DSD reports programmers and the TGU reports

group. There are significant management and organizational issues with the way these two

groups are operated and the way in which they interact with each other.

The responsibilities of TGU include the help desk, testing, functional design and

requirements gathering for the KIDS System.22 The Programs Administrator of TGU is

Mary Grissom and the Assistant Administrator is Carol Clabo. Five program managers

report to Ms. Grissom: Elizabeth Roberts, who is responsible for federal reporting,

Patricia Frye, who is responsible for state reporting, Vickie Streber, who is the functional

advisor, Stacey Bates, who is responsible for KIDS testing and Kellie Mullen, who is

responsible for KIDS training.23 Ms. Roberts and Ms. Frye are the leaders of the TGU

reports group. No one in TGU has any background in computer science or computer

programming.24

16 Gelona, 94-95, 101-111. 17 Gelona, 158-159. 18 Gelona, 20, 148; Grissom (9/7/10), 44. 19 Grissom (9/7/10), 44; Gelona, 110-111, 150. 20 Gelona, 117; Grissom (10/1/08), 27-28. 21 Gelona, 20, 66-67, 115-116. 22 Grissom (9/7/10), 14-15. 23 Roberts, 34-35; Grissom (8/5/10), 6. 24 Grissom (9/7/10), 17-18, 44-45; Roberts, 15-18, 46-47; Jew, 40-42.


The DSD reports programmers are co-located with TGU (i.e., physically located in the same

space).25 They consider themselves to be service providers to the TGU reports

group;26 ultimate responsibility for the child welfare reports lies with TGU.27

Currently there are three DSD reports programmers, John Gelona (the team

leader), Jin Jew and John Vernon.28 These three reports programmers are responsible for

writing, maintaining and updating the computer programming underlying hundreds of

child welfare reports accessible to approximately 2,000 users of the KIDS System.29 Until

November 2010, the DSD reports programmers were supervised by J.G. Nair.30

Ms. Grissom’s management of TGU is deficient for a number of reasons:

Ms. Grissom completely lacks a background in computer science and computer programming,31 which makes it impossible for her to personally evaluate the Access queries produced by her group or the WebFOCUS reports produced by the DSD reports programmers. For example, in her testimony Ms. Grissom discussed an error discovered by Mr. Gelona that apparently resulted from the use of lower case text rather than upper case text by a member of TGU who wrote a query in Access. Because of her lack of experience with Access, she did not know whether Access required upper case, let alone have the ability to check this query herself.32

Ms. Grissom’s lack of experience is compounded by the fact that no one within TGU has any experience in computer science or computer programming. For example, Ms. Roberts, who is responsible for all federal reporting and adoption and post-adoption reporting,33 has had no training in computer programming,34 which is needed to develop and test the queries used to create child welfare reports. Simply put, the “Technology and Governance Unit” has no one with a background in technology.

Ms. Grissom also does not pay sufficient attention to important details related to her job responsibilities. One example is the fact that Ms. Grissom incorrectly believed that the Access databases “only have available the information that is current for that week,”35 only to learn through the litigation process that in fact DSD

25 Roberts, 43. 26 Nair, 18. 27 Roberts, 61; Jew, 80; Gelona, 62. 28 Gelona, 61; Jew, 16-17. 29 Gelona, 61. 30 Nair, 20. 31 Grissom (9/7/10), 17-18, 108. 32 Grissom (9/7/10), 70. 33 Roberts, 20. 34 Roberts, 60, 63. 35 Grissom (8/5/10), 41.


can restore data for any period.36 It is difficult to believe that DHS would put significant effort into long-term data backup procedures while the head of TGU, a group that benefits from such backup data, would remain unaware of this important service. A more likely explanation is a lack of attention to detail by Ms. Grissom. Another example is that Ms. Grissom is aware that changes to the KIDS System require changes to the Access queries, but she did not implement any policies or procedures for checking if changes to the KIDS System were incorporated into the queries. Instead, she expected her group members to manage that process without monitoring the results.37

TGU does not track or monitor whether DHS employees access the child welfare reports, let alone who accesses those reports or how frequently they do so.38 Furthermore, certain reports continue to be produced that have not been used for years and contain significant errors.39 As a specific example, the YI624 report has not been updated in years, so the computer code ignores many tables that are currently available and that were not there when the code for this report was developed.40 This report – which is still available to every child welfare worker, supervisor and manager in the state – incorrectly shows about 7,000 children with no placement. Providing access to such inaccurate reports without tracking if anybody is using the data is a dangerous practice that should be stopped.

Mr. Nair’s management of the DSD reports programmers was deficient for a number of reasons:

Mr. Nair did not have adequate experience with Access and WebFOCUS to manage programmers who utilized WebFOCUS exclusively and who created extracts whose sole purpose was the creation of Access queries.41 Furthermore, he did not conduct meetings with the DSD reports programmers to review the programs they wrote to ensure that they met any standards and to allow for easier software integration.42

Mr. Nair focused on managing the DSD reports programmers’ time, i.e., ensuring that they had sufficient time to fulfill their tasks. 43 He did not focus enough on task-specific management, i.e., ensuring that those tasks met the necessary standards.

Mr. Nair had poor oversight of his team and the software development and data management process. Job partitioning among the DSD reports programmers in his

36 Mr. Gelona thought that backups are available only up to five weeks due to the backup tape rotation process; this is also incorrect (Gelona, 112). Like Ms. Grissom, Mr. Gelona displays a disturbing lack of attention to important details. 37 Grissom (9/7/10), 116-117. 38 Grissom (9/7/10), 65, 85, 90-91; Gelona, 71, 144-145, 152. 39 Gelona, 31. 40 Gelona, 31-32, 135. 41 Nair, 102, 121. 42 Nair, 116-118; Jew, 34-35, 92. 43 Nair, 13-14.


group was such that the programmers’ tasks were too specialized and they were not required to rotate positions in order to become familiar with the functions carried out by other team members. It was evident from Mr. Jew’s deposition that he is not aware of many tasks that Mr. Gelona performs.44 This is probably due to a lack of rotations in this group and suggests a poor management practice. A rotation practice in the software engineering community is aimed at ensuring that system development and maintenance is not overly dependent on a single programmer. Here, it appears that if Mr. Gelona leaves, Mr. Jew would not be prepared to replace him because his current tasks are overly specialized. As a result, there could be significant problems in the future if Mr. Gelona ever leaves DHS. When Ms. Grissom was asked, “What would happen if Mr. Gelona left tomorrow?” her answer was, “We’d all have to learn a lot.”45 In my opinion, that is a serious underestimation. In practice, it would take a long time to learn such a complex set of tasks and this would be even more challenging given that no one at TGU has any training in the required computer programming skills.

Furthermore, Ms. Grissom, Mr. Nair and the personnel who report to them rely

completely on co-location to ensure adequate communication between the TGU reports

group and the DSD reports programmers.46 While co-location is potentially useful, it is not

in any way an adequate substitute for formal processes to monitor data quality and to

manage changes in related applications and outputs, including the child welfare reports,

when the underlying KIDS System is changed. There is a total lack of routine and formal

daily communication between the TGU reports group and the DSD reports programmers.47

Finally, Ms. Grissom, Mr. Nair and the personnel who report to them have allowed

ineffective change control and quality control practices to be implemented. These concerns

are discussed in detail below and are typically addressed by insisting on appropriate

software engineering practice and close formal interactions with members of applications

groups that depend on the programming team’s output.

Overall, neither Ms. Grissom nor Mr. Nair provided effective oversight of their

respective groups. Lax management practices and poor organization have enabled the

inadequate data management practices discussed below.

5. Lack of Adequate Change Control

Software code, including the software code that underlies the KIDS System, is

constantly changing as the computer program is modified and updated. It is necessary to

have the proper process in place to manage these changes in order to ensure that those

44 Jew, 28-29, 36-38, 57-58, 92, 95. 45 Grissom (9/7/10), 74. 46 Nair, 86, 116-118; Grissom (9/7/10), 14, 34-35. 47 Grissom (9/7/10), 34-35, 46-47, 109-110, 118-119; Nair, 91-92, 116.


changes trickle down to, and are properly implemented in, all programs and applications

that interact with the software. The KIDS System is no exception because it interacts with

other programs and applications, including those that are used to create the child welfare

reports. DHS has failed to implement adequate change control systems. As a result, there

is a high risk that the data maintained in the KIDS System, and in the applications that

interact with the KIDS System, is unreliable.

a. Failure to Utilize a Source Code Version Control System

The KIDS WebFOCUS extracts and the Access queries appear to lack a “source code

version control” system. Source code version control systems have a number of functions,

including: (1) recording the changes made to software source code and storing every

version of the source code; (2) allowing a line-by-line comparison of any two versions of

the software source code; and (3) providing rollback support, which is an effective way of

returning to any previously-tested and committed version of the software48 when errors

are found in the current operational version. Essentially, source code version control

systems function for software much as version control systems function for word-

processed documents: users can save multiple versions, compare them to each other and

revert to old versions if necessary.
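
The three functions listed above (recording every version, line-by-line comparison and rollback) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a description of any particular product: the `VersionStore` class and the query text are hypothetical, and the comparison uses the `difflib` module from the Python standard library.

```python
import difflib


class VersionStore:
    """Toy version store: keeps every committed version of one source file."""

    def __init__(self):
        self._versions = []  # list of (message, text) in commit order

    def commit(self, text, message):
        """Record a new version and return its 1-based version number."""
        self._versions.append((message, text))
        return len(self._versions)

    def get(self, number):
        """Return the text of a committed version."""
        return self._versions[number - 1][1]

    def log(self):
        """Return (version number, message) pairs, oldest first."""
        return [(i + 1, msg) for i, (msg, _) in enumerate(self._versions)]

    def diff(self, old, new):
        """Line-by-line comparison of two versions in unified diff format."""
        return list(
            difflib.unified_diff(
                self.get(old).splitlines(),
                self.get(new).splitlines(),
                fromfile=f"v{old}",
                tofile=f"v{new}",
                lineterm="",
            )
        )

    def rollback(self, number, message):
        """Return to an earlier version by committing its text again."""
        return self.commit(self.get(number), message)


# Hypothetical query text, for illustration only.
GOOD = "SELECT age_group, COUNT(*)\nFROM children\nGROUP BY age_group"
BAD = GOOD.replace("children", "childrn")  # simulated typo in a later edit

store = VersionStore()
v1 = store.commit(GOOD, "initial query")
v2 = store.commit(BAD, "edit table name")
changes = store.diff(v1, v2)
v3 = store.rollback(v1, "revert: v2 references a misspelled table")
print("\n".join(changes))
```

In this sketch a rollback is recorded as a new version rather than by deleting the faulty one, so the history still shows that the error occurred and when it was reverted.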

The WebFOCUS extracts and the Access queries need a tool that records the history

of changes made to their software source code and rollback support that allows for

reversion to previous versions of the source code when errors emerge. Errors are

unavoidable; however, it is necessary to have a system in place for addressing them by

returning to a version of the program that predates the error. Source code version control

systems have long been readily available and are standard software engineering practice.

For example, CA Software Change Manager (previously called Harvest) was developed in

the early 1970s and more recent products include Concurrent Versions System (called

CVS), Subversion (called SVN), and Global Information Tracker (called Git).

Instead of using a source code version control system, software revisions in the

WebFOCUS extracts are tracked informally based on comments written in the source code

by the computer programmers, accompanied by their initials to identify who made the

change.49

48 Committing is a “check-in” process that consists of submitting to the source code version control system as a bundle a set of changed files along with a description of the specific changes and the evaluation process. This is aimed at ensuring software quality and traceability of changes. 49 Gelona, 37-38; Jew, 49, 88-89.


A similarly informal system was used for the Access reports. I examined this more

closely by analyzing the system used to create the Access reports. I began by examining the

names and descriptions of the queries used to create these reports.50 These documents

only provide brief descriptions of the queries rather than their detailed structure and a

history of all changes. I was able to obtain a more detailed view of some queries by running

Microsoft Access and reviewing the SQL code underlying these queries. The notes

associated with these queries do not provide much additional information other than the

date of creation and the date of last modification of a query. For example, according to the

time stamps in the database, a query for “Count of Children by Age Group” was created on

July 26, 2002 and was last modified on August 28, 2009. No information was provided

about what modification was made on August 28, 2009 or whether there were additional

modifications in the seven preceding years. Similarly, the creation time stamp for the

query “Count of Children by Age Graphed” is November 7, 2007 and the last modification

was on February 11, 2009. If DHS utilized a source code version control system, it would

be easy to determine the differences between these two versions of the query as well as to

find out how often and to what extent the query was modified.

The KIDS System evolves over time. When an error is found in the KIDS System it is

risky and time-consuming to make immediate additional changes to the computer program

as any modification could result in further errors unless it is rigorously tested before

deployment. In the meantime, a source code version control system allows a quick reversion

to a known reliable version. It also allows the administrators to track the changes and

determine what caused the error.

b. Failure to Utilize Database Change Management Software

The KIDS System also appears to lack “database change management software.”

This software is necessary when a database interacts with other computer programs and

applications because it allows you to (1) track any changes made to the database; (2)

ensure that those changes are properly applied to all programs that interact with the

database; and (3) return to any previous state of the database by restoring all tables, fields,

relationships, views, functions, queries and other elements by using a rollback function.

One example of a database change management tool which allows for rollback is Liquibase.
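
The tracking and rollback functions, items (1) and (3) above, can be sketched as follows. This is a minimal illustration and not Liquibase's interface; it only mirrors the general idea of a recorded list of changes, each paired with a statement that undoes it. The change identifiers, table names and helper functions are hypothetical, and SQLite stands in for the Oracle database. Item (2), propagating changes to dependent programs, is outside the scope of this sketch.

```python
import sqlite3

# Hypothetical change sets: (id, SQL to apply, SQL to undo).
CHANGES = [
    ("001-create-child",
     "CREATE TABLE child (id INTEGER PRIMARY KEY, name TEXT)",
     "DROP TABLE child"),
    ("002-create-placement",
     "CREATE TABLE placement (child_id INTEGER, county TEXT)",
     "DROP TABLE placement"),
]


def applied_ids(conn):
    """Return the ids of applied changes, in the order they were applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS changelog "
        "(seq INTEGER PRIMARY KEY AUTOINCREMENT, change_id TEXT UNIQUE)"
    )
    rows = conn.execute("SELECT change_id FROM changelog ORDER BY seq")
    return [row[0] for row in rows]


def update(conn):
    """Apply every change that is not yet recorded in the changelog."""
    done = set(applied_ids(conn))
    for change_id, apply_sql, _ in CHANGES:
        if change_id not in done:
            conn.execute(apply_sql)
            conn.execute(
                "INSERT INTO changelog (change_id) VALUES (?)", (change_id,)
            )
    conn.commit()


def rollback(conn, count):
    """Undo the most recent `count` changes, newest first."""
    undo_sql = {change_id: undo for change_id, _, undo in CHANGES}
    applied = applied_ids(conn)
    start = max(0, len(applied) - count)
    for change_id in reversed(applied[start:]):
        conn.execute(undo_sql[change_id])
        conn.execute("DELETE FROM changelog WHERE change_id = ?", (change_id,))
    conn.commit()


def table_names(conn):
    """List user tables, excluding the bookkeeping tables."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' "
        "AND name NOT IN ('changelog', 'sqlite_sequence') ORDER BY name"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
update(conn)
after_update = table_names(conn)
history = applied_ids(conn)
rollback(conn, 1)
after_rollback = table_names(conn)
print(after_update, after_rollback)
```

Because every structural change is recorded in the `changelog` table, it is possible to answer what changed and in what order, and to return the structure to an earlier state by undoing changes in reverse.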

At his deposition, Mr. Nair described the only change management technology in use

at DHS, a program called Remedy. The Remedy program tracks changes at the level of a

KIDS System release, not at the level of specific enhancements to the KIDS System. So, the

Remedy program cannot be used for testing, data quality control or auditing of changes to

50 E.g., YI684 Queries with Descriptions (Data C&R-11-00001-00044); YI678 Queries with Descriptions (Data C&R-9-00001-00015).


the KIDS System. Mr. Nair testified further that Remedy only works at the level of “system

changes” and does not “track to [the] level” of specific changes to a WebFOCUS or

PowerBuilder program.51 Database change management software should track all changes

that are made to the database that will affect any software that depends on the database,

including all software used to generate reports from the database.

My analysis of the “version notes” that accompany the periodic (usually monthly or

bimonthly) releases of new versions of the KIDS System52 shows that the KIDS System

regularly undergoes many structural changes.53 Database change management software

should have been utilized to ensure that these changes were properly applied to all

programs and applications that interact with the KIDS System, including those used to

create the data extracts that underlie the WebFOCUS and Access reports. Database change

management software would also have allowed DHS to return to any previous state of the

database if an error was identified. DHS’s failure to utilize appropriate change

management techniques almost certainly results in errors in programs and applications

that rely on the KIDS System for input; without a way of returning to a functional version of

the database, these errors will be difficult to correct.
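
One way a structural change can be connected to the reports it affects is to record, for each report, the tables and columns it reads, and to compare that record against the current database structure after every release. The following Python sketch illustrates the idea under stated assumptions: the report names, tables and columns are hypothetical, SQLite stands in for the Oracle database, and the dependency list is maintained by hand.

```python
import sqlite3

# Hypothetical reports and the columns each one reads (table -> columns).
REPORT_DEPENDENCIES = {
    "placement_count": {"placement": {"child_id", "county"}},
    "staffing_summary": {"worker": {"worker_id", "unit"}},
}


def current_schema(conn):
    """Map each table name to the set of its column names."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )]
    schema = {}
    for table in tables:
        # PRAGMA table_info returns one row per column; index 1 is the name.
        # Table names come from sqlite_master, not from user input.
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = {column[1] for column in columns}
    return schema


def broken_reports(conn):
    """Return {report: sorted list of missing 'table.column' names}."""
    schema = current_schema(conn)
    problems = {}
    for report, needs in REPORT_DEPENDENCIES.items():
        missing = sorted(
            f"{table}.{column}"
            for table, columns in needs.items()
            for column in columns - schema.get(table, set())
        )
        if missing:
            problems[report] = missing
    return problems


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE placement (child_id INTEGER, county TEXT)")
conn.execute("CREATE TABLE worker (worker_id INTEGER, unit TEXT)")
before = broken_reports(conn)

# A new release replaces a column that one report depends on.
conn.execute("DROP TABLE placement")
conn.execute("CREATE TABLE placement (child_id INTEGER, county_code TEXT)")
after = broken_reports(conn)
print(before, after)
```

A check of this kind only detects columns or tables that a report needs and that no longer exist. It does not detect the opposite problem, such as a report that does not use tables added after it was written, which still requires review of each release against each report.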

Instead of utilizing database change management software, DHS relies entirely on

co-location, and the resulting informal communication between members of TGU and the

reports programmers, to ensure that every relevant change made to the KIDS System is

communicated to the DSD reports programmers. The testimony amply demonstrates

DHS’s complete reliance on co-location and informal communication to ensure that

changes to the KIDS System are properly implemented in the programming code for the

Access and WebFOCUS reports. For example:

From the deposition of Mr. Nair, it is clear that he relies very heavily on the concept of co-location to ensure that the Access and WebFOCUS reports are accurate.54 Mr. Nair’s reliance on co-location is alarming given his computer science background, because co-location is not a sound change management practice. Furthermore, Mr. Nair did not “think there’s one single person that has the responsibility of making sure that any change we make on KIDS . . . what change it will have on the reports. We don’t have one person responsible for that.”55 He was not aware of any systematic way in which change control is effected or of any formal meetings where change control is discussed.56 This testimony was very surprising because the

51 Nair, 37-40, 72-75, 92. 52 Nair, 38, 90. 53 2004 Version Notes; 2005 Version Notes; 2006 Version Notes; 2007 Version Notes; 2008 Version Notes; January 2008 Version Notes; 2009 Version Notes. 54 Nair, 81, 86, 89-90, 92-93. 55 Nair, 87. 56 Nair, 87-89, 117-118.

version notes provide strong evidence that changes in the KIDS System included many data schema modifications that could easily impact the reports.

Mr. Gelona and his team of DSD reports programmers are not keeping track of changes made to the KIDS System and he does not regularly review the KIDS System version notes, or receive a summary of those version notes from anyone, in order to determine whether there have been any important changes to the KIDS System.57 Instead, he relies completely on TGU to tell him when a change occurs in the KIDS System that would require him to modify his programming code.58 Mr. Gelona insisted that his YI684 extract file, which is used to create the YI684 database and consists of 8,000 lines of computer programming code, does not require any regular updating or maintenance, no matter what changes are made to the KIDS System.59 Further, Mr. Gelona operates without source code version control management software or database change management software that would help to ensure that his extracts and his reports are accurate and to allow easy rollback to any previous version if needed.

Mr. Jew also fails to utilize source code version control and data change management tools. He does not even keep a record of which WebFOCUS reports he writes the source code for.60 Like Mr. Gelona, he relies entirely on comments he writes in the source code to record the changes he makes,61 but this approach does not provide a sufficient record of changes or rollback options to previous configurations. Furthermore, the only way Mr. Jew learns about changes to the KIDS System is from the TGU reports group, either directly or indirectly. He does not regularly receive or review the version notes to the KIDS application.62

Ms. Grissom relies on the concept of co-location, and places a great deal of trust in the DSD reports programmers,63 but does not have the computer programming background to evaluate whether they are utilizing best practices.64 Ms. Grissom did acknowledge that whenever a change is made to the KIDS System, it is necessary to make changes to affected Access and WebFOCUS queries but testified that TGU has no formal policy in place to ensure that this is done.65

Ms. Roberts is not familiar with sound software engineering methods for source code version control or database change management. In fact, Ms. Roberts does not regularly review the version notes to identify changes, errors and/or bugs that

57 Gelona, 128-130, 178-180, 233. 58 Gelona, 123-127, 130-132. 59 Gelona, 143. 60 Jew, 59. 61 Jew, 49, 88-89. 62 Jew, 89-95. 63 Grissom (9/7/10), 52, 117-119. 64 Grissom (9/7/10), 17. 65 Grissom (9/7/10), 110-111, 115-119.

might impact the reports for which she is responsible.66 She only becomes aware of changes that might affect reports through verbal discussions with Ms. Clabo or Ms. Streber, two other members of TGU. Ms. Roberts is responsible for discussing changes to the KIDS System with the DSD reports programmers if a change affects a report.67

The practices outlined above are totally insufficient, as evidenced by the recently-

discovered problems with the Access reports uncovered by Mr. Gelona (discussed in detail

below). It is unlikely, however, that these problems are limited to the Access reports. The

reliance on co-location and informal communication from non-programmers to ensure

change control has the potential to adversely affect every child welfare report generated by

DSD and TGU.

To summarize, if a change is made to the KIDS System, and that change affects the

programming code underlying a child welfare report, the process currently in place does

not ensure that the programming code will be updated. Thus, the child welfare report will

no longer be reliable or up-to-date. Because of the frequency with which changes are made

to the KIDS System, it is likely that this problem is widespread and that many child welfare

reports are adversely affected. No user of a child welfare report can rely on that report

being accurate or up-to-date.
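The gap summarized above can be made concrete with a minimal sketch, offered under stated assumptions: if each report declared which tables and columns its query reads, a simple check after every release could list the reports whose code no longer matches the database. The schema, report names and the function `affected_reports` below are hypothetical and are not drawn from the KIDS System.

```python
def affected_reports(schema, report_dependencies):
    """Map each report to the (table, column) pairs it needs but the schema lacks."""
    broken = {}
    for report, needed in report_dependencies.items():
        missing = sorted(
            (table, column)
            for table, column in needed
            if column not in schema.get(table, set())
        )
        if missing:
            broken[report] = missing
    return broken


# Hypothetical schema after a release that renamed placement.county to county_code.
schema_after_release = {
    "placement": {"child_id", "county_code", "start_date"},
    "child": {"child_id", "birth_date"},
}

# Hypothetical declaration of what each report's query reads.
report_dependencies = {
    "count_by_county": [("placement", "child_id"), ("placement", "county")],
    "age_summary": [("child", "child_id"), ("child", "birth_date")],
}

flagged = affected_reports(schema_after_release, report_dependencies)
```

Here only `count_by_county` is flagged, because it still reads a column that the release removed. A check of this kind does not depend on anyone remembering to mention the change.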

6. Lack of Adequate Quality Control

In addition to a lack of adequate change control, the KIDS System suffers from a lack

of quality control. This lack of quality control affects not only the KIDS System itself, but

also the child welfare reports that are based on information stored in the KIDS System.

Neither the DSD reports programmers nor the TGU reports group sufficiently tests the

child welfare reports to ensure that they are accurate before they are made available to

child welfare workers, supervisors and managers.

Turning first to the KIDS System, the version notes that I reviewed list numerous

functionality errors that went undetected during the KIDS System software development

and testing process. A list of some specific examples is provided in Supplement 4. The

presence of so many errors suggests insufficiently rigorous software testing protocols. In

addition, I was surprised that Mr. Nair, given his position, was not sure whether the testing

done by contractors from Oklahoma University is only done for major enhancements or for

every modification to the KIDS System.68 This suggests that feedback to his group at DSD

from these testers was missing or that management of the programmers and/or testers

was poor.

66 Roberts, 56, 83. 67 Roberts, 54-60, 83-87, 102-103, 112-113. 68 Nair, 79.

Furthermore, Ms. Grissom is responsible for initiating changes to the KIDS System,69

yet she has no formal or informal programming background. She claims that any errors in

the child welfare reports are limited to the Access queries, while the KIDS System itself is

fine.70 Her confidence in the KIDS System is misplaced. For example, she suspects that a

“documentation error” is the reason why only one child is listed in a report as being in a

DHS-operated facility.71 Assuming Ms. Grissom is correct, this suggests that there are

insufficient quality control procedures at the data entry stage such as poor training of

workers and/or inadequate data entry testing procedures. Furthermore, the fact that Ms.

Grissom’s team has not found such an obvious error is another serious concern as this kind

of outlier in a dataset is very easy to spot by simple statistical techniques.
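As one example of such a simple statistical technique, the sketch below flags values that lie far from the median of a series, using the median absolute deviation as the measure of spread. The figures are invented for illustration and are not actual DHS counts; the function name `flag_outliers` and the cut-off of 3.5 are assumptions (3.5 is a commonly used threshold for this score).

```python
from statistics import median


def flag_outliers(counts, threshold=3.5):
    """Return labels whose count has a modified z-score above `threshold`.

    The modified z-score uses the median and the median absolute
    deviation (MAD), so one extreme value does not mask itself.
    """
    values = list(counts.values())
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return []  # no spread to measure against
    return sorted(
        label
        for label, value in counts.items()
        if 0.6745 * abs(value - center) / mad > threshold
    )


# Hypothetical monthly counts for one report line; not actual DHS figures.
monthly_counts = {"Jan": 412, "Feb": 388, "Mar": 1, "Apr": 430, "May": 405}
suspicious = flag_outliers(monthly_counts)
```

With these invented figures, only the March value of 1 is flagged. A check like this would not explain the cause of the anomaly, but it would bring it to someone's attention before the report is released.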

This same lack of quality control permeates the process for evaluating the software

used to create the child welfare reports. Whenever programmers develop a piece of

programming code, it is standard practice to perform both “verification” and “validation.”

Verification is the process of evaluating software to determine whether it has been

designed in accordance with the specifications. Validation is the process of evaluating

software to determine if those specifications were correct in the first place, i.e., whether the

software actually meets the users’ needs. In other words, verification ensures that you

built it right, while validation ensures that you built the right thing. The software testing

protocol should include “white box testing,” or rigorous evaluation of the internal

structures of the software by the programmers who developed it, which can uncover many

errors in individual units of source code (e.g., control and data flow errors and branching

errors in the implementation of an algorithm). The software testing protocol should also

include “black box testing,” or rigorous evaluation of the software by people who were not

involved in its development by selecting valid and invalid inputs to determine if the

functional requirements are satisfied. At least for the KIDS System, DHS appears to utilize

white box testing by technical staff at DSD.72
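The distinction between black box and white box testing can be shown with a small, purely hypothetical example. The function `count_by_area` below is invented for illustration and is not taken from any DHS program. The black box tests use only the stated requirement (count each child once per area, reject records with no area), while the white box tests are written with knowledge of the code's internal branches.

```python
def count_by_area(records):
    """Count children per area; each record is a (child_id, area) pair."""
    counts = {}
    seen = set()
    for child_id, area in records:
        if not area:
            raise ValueError(f"child {child_id} has no area")
        if child_id in seen:
            continue  # branch: do not count the same child twice
        seen.add(child_id)
        counts[area] = counts.get(area, 0) + 1
    return counts


def run_tests():
    # Black box: valid input checked against the stated requirement only.
    assert count_by_area([(1, "Area I"), (2, "Area I"), (3, "Area II")]) == {
        "Area I": 2,
        "Area II": 1,
    }
    # Black box: invalid input must be rejected, not silently counted.
    try:
        count_by_area([(4, "")])
    except ValueError:
        pass
    else:
        raise AssertionError("missing area was accepted")
    # White box: exercise the duplicate-child branch and the empty-loop path.
    assert count_by_area([(1, "Area I"), (1, "Area I")]) == {"Area I": 1}
    assert count_by_area([]) == {}
    return True
```

Tests of this kind can be rerun automatically after every change, which is what separates them from an informal look at whether the output seems plausible.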

However, DHS does not appear to follow any of these standard practices for the

child welfare reports. Instead, the DSD reports programmers and the TGU reports group

utilize “face validity” testing only, which assesses, without reference to any defined

standards, whether the reports “look like they work” without rigorous evaluation of

whether that is the case. The child welfare reports do not undergo any standardized

testing protocol before being released to child welfare workers, supervisors and

managers.73 This is completely inadequate and leads to a high likelihood that the computer

69 Grissom (10/1/08), 155; Gelona, 62-63. 70 Grissom (9/7/10), 59. 71 Grissom (8/5/10), 23. 72 Nair, 76. 73 Grissom (9/7/10), 27-28.

code used to create the child welfare reports – and the reports themselves – contains

errors. Specifically:

According to Ms. Grissom, she and her team look at the outcome on a report, compare it to other similar reports and do a “number of different kinds of testing.”74 This is problematic because if they overlook an error by an informal comparison to some of the prior reports, then that error will become more difficult to identify later. A sound testing procedure is needed to minimize the possibility that errors stay undetected for long periods of time. Ms. Grissom believes that the known problems with the Access reports (discussed below) would not have happened if the Access queries were “reviewed by the person who created [them] to make sure that the logic was good.”75 This is a naïve view of the software testing process. Instead, Ms. Grissom, as the Programs Administrator of TGU, should have enforced rigorous testing procedures, including internal and external testing and regular updates of queries by using the test protocols any time that a data schema affecting the query was modified. All versions of modified queries should have been committed and queries should have been evaluated over time. Ms. Grissom has not put into place any rigorous testing or change management protocols to ensure data quality, leading to a high risk that every child welfare report contains errors.

Ms. Roberts utilizes face validity testing to ensure that the reports she is responsible for are accurate. Basically, her quality control measures consist of nothing more than relying on the fact that she, a member of the TGU reports group or a member of the child welfare field staff will notice an error in the report itself.76 In an attempt to identify errors, she looks at the data in the reports and compares it to the information in the KIDS System. However, when a report contains aggregate data, “I don’t know that I have anything specifically to check [the report] against;” instead, all she is able to look at is “what has the data been telling you over time and . . . is this a reasonable fluctuation or steady line and is anybody questioning you about it.”77 Ms. Roberts’ objective seems to be to identify only those problems that “stick[] out like a sore thumb.”78 For the federal reports, Ms. Roberts also uses a compliance utility, one of the two utilities that were given to her by the Children’s Bureau, to help ensure data consistency and quality, but this utility is only used to identify obvious problems.79 Ms. Roberts stated that the other utility provided by the Children’s Bureau, the data quality utility, is not used very often, and that she is not certain how the utility functions.80

Mr. Nair testified that there was no systematic testing of any child welfare reports,

74 Grissom (9/7/10), 27-28, 38-39. 75 Grissom (9/7/10), 49. 76 Roberts, 50-51. 77 Roberts, 100-102. 78 Roberts, 51. 79 Roberts, 67-70. 80 Roberts, 70-71.

aside from the use of utilities provided by the federal government to test the AFCARS and NCANDS data. He was not aware of any systematic way in which WebFOCUS reports are tested after they have been put into production to ensure that they are still generating valid data.81 Mr. Nair had more confidence in the federal reports because of the utilities provided by the Children’s Bureau.82 This confidence, however, is misplaced. Mr. Nair seemed unaware of the fact that according to Ms. Roberts, the compliance utility is only used to identify obvious problems and she rarely uses, and does not fully understand how to use, the data quality utility.

Mr. Gelona does not perform any rigorous white box software testing on the programming code he writes for the child welfare reports. Instead, he relies on the TGU reports group and end users – none of whom have computer programming experience – to identify problems with the reports and, for the federal reports, relies on the Children’s Bureau utilities, though, as stated above, the data quality utility is infrequently used.83 Furthermore, contrary to accepted practice, Mr. Gelona concludes that his extracts from the KIDS System are accurate as long as the users of the reports that are generated do not tell him there is a problem with the report.84 He makes changes to his extracts only if asked to add a new field or if somebody tells him to make a change based on changes to the KIDS System.85 These practices are highly unsound and problematic. Given the lack of personnel with programming backgrounds in TGU and the limited scope of the Children’s Bureau utilities, problems with the data can easily go undetected.

According to Mr. Gelona, he has not written any of the queries used to create the Access reports.86 Although he prepares the data extracts used to write these queries and is co-located with the TGU reports group who write these queries, it appears that Mr. Gelona is not collaborating very closely with TGU. In particular, he testified that he never checked the Access queries until the last six months when he was asked to assist with running the YI684 queries for the quarters ending March 2007 to March 2010 (for the purposes of this litigation).87 It seems as though there was no ongoing quality control with respect to the Access queries. It is very poor practice that he, or someone else from DSD, was not asked to comprehensively check the Access queries considering that the people who write and manage these queries do not possess strong query writing or programming skills and that their quality control testing relies heavily on comparing data in a very short time period (according to Ms. Grissom’s deposition, for the reports that are produced on an

81 Nair, 125-126. 82 Nair, 104. 83 Gelona, 74-75, 131-134, 207-208. 84 Gelona, 131-133. 85 Gelona, 123. 86 Gelona, 12. 87 Gelona, 138-139, 149-150, 240.

ongoing basis they have only two hours per week to do so).88

Most of the issues raised by Mr. Gelona’s testimony are also applicable to Mr. Jew’s testimony. Mr. Jew verifies his programming code in a completely informal way by manual spot checks and he only checks his programming code if someone reports an error to him.89 Mr. Jew acknowledged that it is not always possible to check a report against a screen in the KIDS System, if, for example, the report contains aggregate information.90 For federal reports, Mr. Jew manually spot checks some of the information in the AFCARS file against the information in the KIDS System and occasionally, but “[n]ot often,” uses the Children’s Bureau utilities.91 In his view, ultimate responsibility for verifying the reports belongs to the TGU reports group.92 These are not sound software testing practices.

The face validity testing and Children’s Bureau utility (which is used only for federal

reporting) are not a substitute for rigorous quality assurance and testing practices within

the agency. These inadequate quality control procedures equally affect the Access and

WebFOCUS reports. The Access reports, however, also suffer from the fact that the queries

themselves are written by poorly trained non-programmers with little oversight from the

computer professionals in DSD. In my opinion, there is a significant risk that every child

welfare report contains inaccurate data because of these quality control issues.
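Reports that contain aggregate data can still be checked systematically. One approach, sketched below under stated assumptions, is to recount the underlying records independently from the report's stated definition and compare the result with what the report query produces. The table, the county labels, both queries and the function `reconcile` are hypothetical; the example uses an in-memory SQLite database and does not reproduce any actual Access or WebFOCUS query.

```python
import sqlite3
from collections import Counter


def reconcile(conn, report_sql, source_sql):
    """Compare a report's aggregate rows with an independent recount.

    `report_sql` must return (group, count) rows; `source_sql` must
    return one group value per underlying record.
    Returns {group: (reported, recounted)} for every group that differs.
    """
    reported = dict(conn.execute(report_sql).fetchall())
    recounted = Counter(row[0] for row in conn.execute(source_sql))
    return {
        group: (reported.get(group, 0), recounted.get(group, 0))
        for group in sorted(set(reported) | set(recounted))
        if reported.get(group, 0) != recounted.get(group, 0)
    }


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE placement (child_id INTEGER, county TEXT, active INTEGER)")
conn.executemany(
    "INSERT INTO placement VALUES (?, ?, ?)",
    [(1, "County A", 1), (2, "County A", 1), (3, "County A", 0), (4, "County B", 1)],
)

# Hypothetical report query that forgot to exclude inactive placements.
report_sql = "SELECT county, COUNT(*) FROM placement GROUP BY county"
# Independent recount written from the report's stated definition.
source_sql = "SELECT county FROM placement WHERE active = 1"

differences = reconcile(conn, report_sql, source_sql)
```

In this invented case the report overstates County A by one child, and the discrepancy is found without anyone having to notice that a number looks unusual.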

7. The Child Welfare Reports Are Wrong

It is unquestionably important that the information contained in the reports used by

the child welfare workers, managers and supervisors in Oklahoma be accurate, complete

and up-to-date. Indeed, Ms. Grissom, Ms. Roberts and Mr. Gelona all rightly testified that

this was true.93 Furthermore, it is clear from testimony that the Access and WebFOCUS

reports are actually being used by child welfare workers, supervisors and managers.94

Unfortunately, DHS lacks auditing capabilities that would allow for anyone to track the use

of these reports in a precise and detailed way.95

DHS recently discovered serious problems with numerous Access reports. Although

the DSD reports programmers do not routinely check or monitor the Access queries,96

during the summer of 2010, Mr. Gelona was asked to look into these queries for reasons

88 Grissom (9/7/10), 122. 89 Jew, 71, 83-84, 87. 90 Jew, 70-73. 91 Jew, 83-85, 87-88. 92 Jew, 69-76. 93 Grissom (9/7/10), 230-231; Gelona, 230-232; Roberts, 78-79, 100, 112, 114-115. 94 Grissom (9/7/10), 78-80, 87, 89-93; Roberts, 25, 28-29, 96-100, 105, 111-112. 95 Grissom (9/7/10), 65, 85, 90-91; Gelona, 71, 144-145, 152. 96 Gelona, 240.

related to this litigation.97 When he did, he found numerous problems with the YI684

Access queries and documented those problems in a report titled “Problems with YI684

queries for Litigation Discovery.”98 Mr. Gelona’s report discusses general problems that

affect all of the YI684 queries, including (1) flaws in the way counties and areas are

determined and (2) inconsistencies in area and county name labels. The report also

describes in detail problems with 60 specific queries. Mr. Gelona testified that some of

these problems will “have a huge effect on the queries” and that between 50 and 70 percent

of the YI684 queries he reviewed were affected by the problems he discovered.99 Ms.

Grissom and Mr. Nair understood that these problems likely infected all of the Access

reports and possibly the WebFOCUS reports as well.100 All of the DSD and TGU personnel

who were asked about these issues, including Ms. Grissom, Mr. Nair and Mr. Gelona,

expressed serious concerns about these problems.101

Mr. Gelona attributed these problems to the fact that “that’s what happens when you

have non-professional programmers write programs . . . [T]hey don’t fully understand all of

the data they have got or . . . all of the relationships between the data.”102 This is a serious

over-simplification of the reasons for these errors. While it is true that this is one reason

for the problems, and the WebFOCUS programming code was probably written more

skillfully because professional programmers were in charge instead of the non-

programmers at TGU, the lack of change control and quality control described above makes

errors in all of the reports highly likely.

I tried to analyze the reasons behind the specific errors identified in Mr. Gelona’s

report. Given the limited documentation of revisions made to the source code (described

above) it was impossible to fully understand the process that led to the errors (e.g.,

whether the errors were due to structural changes to the KIDS System, structural changes

to the WebFOCUS extract or other reasons). In order to fully undertake this analysis, it

would be necessary for DHS to utilize a source code version control tool that allows for

rollback to undo changes in the database (or at least retrieval of queries and data tables by

specific dates). Despite the lack of this tool, I was able to identify a number of queries that

appear to have been affected by changes to the KIDS System that are listed in the version

notes; these changes were not properly implemented in the Access queries. Contrary to

97 Grissom (9/7/10), 58-59, 149-150. 98 Deposition Exhibit 319. 99 Gelona, 151-152. 100 Grissom (9/7/10), 94-97; Grissom (8/5/10), 50, 53; Nair, 101-104. 101 Gelona, 151-153; Grissom (8/5/10), 50, 53; Grissom (9/7/10), 62, 230-231; Nair, 95-97. Ms. Frye also expressed concern about these problems in an email to Ms. Grissom (Issuesw-AccessComm-00001). 102 Gelona, 137-138.

Mr. Gelona’s assertion, this is not simply an issue of non-professional programmers; it is a

fundamental problem of a failure to implement standard database management practices.

In addition to outright errors, there is also the misuse of titles and terminology in

the child welfare reports. For example, Ms. Grissom testified that a report titled “Count of

Children in Foster Care by Area” is mislabeled because “the title does not reflect what’s in

the report.”103 Furthermore, multiple definitions were used for the same term in the

reports. One example is the term “family foster care,” as discussed by Ms. Grissom.104

Mislabeling reports and using the same term in different ways in different reports is

confusing and bad practice. Users of these reports could easily make mistakes and misuse

the reports as a result of these practices.
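Inconsistent terminology of this kind can be detected mechanically if each report records the definition it uses for each term. The following is a minimal sketch under that assumption; the report names and the placeholder definitions are hypothetical, and the actual definitions used in the DHS reports are not reproduced here.

```python
from collections import defaultdict


def conflicting_terms(report_definitions):
    """Return terms that are defined differently in different reports."""
    by_term = defaultdict(lambda: defaultdict(list))
    for report, definitions in report_definitions.items():
        for term, definition in definitions.items():
            # Normalize case and spacing so trivial differences are ignored.
            normalized = " ".join(definition.lower().split())
            by_term[term][normalized].append(report)
    return {
        term: {definition: sorted(reports) for definition, reports in variants.items()}
        for term, variants in by_term.items()
        if len(variants) > 1
    }


# Placeholder definitions only; the real wording is not known from the testimony.
report_definitions = {
    "report_1": {"family foster care": "definition A", "area": "definition X"},
    "report_2": {"family foster care": "definition B", "area": "definition X"},
}

conflicts = conflicting_terms(report_definitions)
```

In this sketch the term "family foster care" is reported as conflicting, while "area" is not, because both reports define it the same way.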

Fundamentally, the poor database and software management practices described

above make it likely that other Access and WebFOCUS reports beyond those identified by

Mr. Gelona contain errors. It is my opinion that the errors in these reports are numerous

and it is highly likely that many of the reports are inaccurate. In my opinion, it is harmful to

the children in DHS custody for DHS to continue to use those reports because they cannot

be relied upon to provide complete, accurate or up-to-date information. While the overall

damage to end users cannot be completely assessed because of the lack of auditing

capabilities, the widespread use of the reports makes it highly likely that child welfare

workers, supervisors and managers who are directly responsible for the welfare of

children in DHS custody are utilizing reports that are not reliable, accurate, complete or up-

to-date.

__________________________________

Zoran Obradovic, Ph.D.

March 15, 2011

103 Grissom (8/5/10), 55. 104 Grissom (8/5/10), 69.

Supplement 1

ZORAN OBRADOVIC (last update: Jan. 16, 2011)

ADDRESS: Center for Information Science and Technology,

Temple University, 303 Wachman Hall (038-24) 1805 N. Broad St., Philadelphia, PA 19122, USA Phone (215) 204-6265, FAX: (215) 204-5082 E-mail: [email protected], WWW: http://www.ist.temple.edu/~zoran

RESEARCH INTERESTS:

Data Mining; Machine Learning; Spatial and Temporal Data Management; Bioinformatics.

TEACHING INTERESTS:

Data Mining; Bioinformatics; Machine Learning; Databases; Data Warehousing; Algorithms; Time Series Analysis, Geographical Information Systems, Pattern Recognition; Neural Networks; Intelligent Data Analysis; Artificial Intelligence; Parallel and Distributed Computing.

EDUCATION:

Ph.D. in Computer Science, The Pennsylvania State University, May 1991. Dissertation: “Discrete Multi-Valued Neural Networks."

M.S. in Mathematics and Computer Science, University of Belgrade, June 1987. Thesis: “High-Speed Parallel Computing."

B.S. in Applied Mathematics, Computer and Information Sciences, Univ. of Belgrade, Dec. 1985.

PROFESSIONAL EXPERIENCE:

Director 2000 - Present. Center for Information Science and Technology, Temple University, Philadelphia, PA.

Professor (tenured) 2000 - Present. Computer and Information Sciences Department, Temple University, Philadelphia, PA.

Associate Director 2003 - 2004. Center for Quantitative Biology and Biomedical Mathematics, Temple University, Philadelphia, PA.

Associate Professor (tenured) 1997 - 2000. (Assistant Professor, 1991 - 1997.) School of Electrical Engineering and Computer Science, Washington State Univ., Pullman, WA.

Guest Scientist, Fall-Winter 1998. Information and Communications Department, Corporate Research and Development, Siemens AG, Munich, Germany.

Adjunct Research Professor, 1999 - Present. (Adjunct Scientist, 1986 - 1999) The Mathematical Institute, Academy of Sciences and Arts, Belgrade, Serbia.

GRANT SUPPORT:

Funded Projects:

• Shi, J., Obradovic, Z. (Jan. 2011 – Aug. 2011) “Integrated Data Warehouse,” City of Philadelphia, Project 21100816140717, $135,064.

• Unterwald, E.M. et al (July 2010 – June 2015) “Center on Intersystem Regulation by Drugs of Abuse,” Database and Drug Interaction Core (with Tallarida, R.), National Institute of Health, Grant 2P30DA013429, $812,301 direct cost per year.

• Wu, J., Bishwas, S.K., Bai, L., Criner, G.G., Galvinski, E.T., Klein, M.L, Kohlweyer, A, Kwatny, G., Obradovic, Z., Rivin, I., Shi, Y., (May 2010 – April 2013) “MRI-R2: Acquisition: A Hybrid High-Performance GPU/CPU System,” National Science Foundation, NSF-CNS-0958854, $839,221.

• Kelsen, S., Merali, S., Obradovic, Z. (Sept. 2009 – Aug. 2011) “Ancillary Study: Identification of Plasma Biomarkers in Chronic Obstructive Pulmonary Disease,” National Institute of Health, Grant 1RC2HL101713-01, $784,389.

• Obradovic, Z. (Oct. 2008 – Dec. 2011) “Improving Biomedical Informatics Support at Temple Health Sciences Center,” The Pennsylvania Department of Health, $300,000 (direct costs).

• Dunker, A.K., and Obradovic, Z. (June 2008 – May 2010) “Bioinformatics Linkage of Protein Disorder and Function,” National Institute of Health, Grant R56 LM007688-05A1 $441,508.

• Obradovic, Z., Vucetic, S. and Z. Li (Aug. 2006 – July 2011) “Collaborative Research: Data Mining Support for Retrieval and Analysis of Geophysical Parameters,” National Science Foundation, NSF-IIS-0612149, $600,404 ($400,207 to Temple University).

• Harris A., Obradovic, Z., Izenman, A., Mennis, J. (Sept. 2006 – Sept. 2009) “Investigating Simultaneous Effects of Individual, Program and Neighborhood Attributes on Juvenile Recidivism Using GIS and Spatial Data Mining,” National Institute of Justice, GMS Award 2006-IJ-CX-0022, $ 316,714.

• Soprano, D.R., Soprano, K.J., Obradovic, Z. and Vucetic, S. (April. 2005 – Dec. 2009) “PBX and Retinoic Acid-Dependent Differentiation,” National Institute of Health, NIH- 1 R01 DK070650-01, $1,586,250.

• Obradovic, Z. and Vucetic, S. (June 2004 – April 2008) “Applications of Bioinformatics Data Analysis to Cardiovascular and Cancer Research,” The Pennsylvania Department of Health, $250,000 (direct costs)

• Megalooikonomou, V., Obradovic, Z., Boyko, O.B., Gee, J. (January 2004 – December 2007) “Large Scale Data Analysis for Brain Images,” National Institute of Health, Grant NGA: 1 R01 MHO68066-01A1, $1,284,246.

• Dunker, A.K., and Obradovic, Z. (Sept. 2003 – Sept. 2007) “Bioinformatics Linkage of Protein Disorder and Function,” National Institute of Health, Grant R01 LM007688-01A1, $1,291,356.

• Harris A., Obradovic, Z., Izenman, A., Mennis, J. (July 2006 – Dec. 2006) “Investigating the Simultaneous Effects of Individual, Program and Neighborhood Attributes on Juvenile Recidivism Using GIS and Spatial Data Mining,” Institute of Public Affairs, Temple University, $16,320.

• Obradovic, Z. and Vucetic, S., (August 2002 - July 2006) “ITR/Small/Scientific Frontiers: Task-Specific Data Reduction and Mining in Spatial-Temporal Domains," National Science Foundation, Grant 0219736, $210,120.

• Obradovic, Z. and Vucetic, S., (June 2004 – Aug. 2004) “REU Supplement for ITR: Task-Specific Data Reduction and Mining for Spatial-Temporal Domains," National Science Foundation, $12,000.

• Kwatny, E., Stafford, R., Megalooikonomou, V. and Obradovic, Z., (Sept. 2001 - Sept. 2004) “High Performance Network Connection for Knowledge Discovery Research," National Science Foundation, Grant NSF-ANIR-0124390, $353,100 ($150,000 from NSF).

• Obradovic, Z. and Vucetic, S. (January 2004 – June 2004) “Research Infrastructure and Expertise for Gene Expression Data Analysis,” The Pennsylvania Department of Health, $70,000 (direct costs).

• Obradovic, Z., Chang, F.N., Tuszynski, G. P. and Vucetic, S. (January 2004 – June 2004) “Mining High Performance Liquid Electrophoresis Data,” Temple University, $8,000 (direct costs).

• Wolfgang, P., Obradovic, Z., Megalooikonomou, V. and Vucetic, S., (June 2003 – December 2003) “Visualization and Analysis of Commercial Flight Data,” Lockheed Martin Corp., $49,000

• Obradovic, Z. (January 2003 – August 2004) “An Efficient System for Discovering Patterns and Associations at Earth Observation Databases,” New Previously Unfunded Directions for Established Investigators Grant Application, Temple University, $30,000.

• Obradovic, Z. (March 2001 - September 2001) “Data Reduction for Spatial-Temporal Knowledge Discovery," Idaho National Engineering and Environmental Laboratory, LDRD Program under DOE contract DE-AC07-99ID13727, $50,000.

• Dunker, A.K and Obradovic, Z., (May 2000 - May 2003) “Bioinformatics, Disordered Proteins and Function," The National Institute of Health, Grant 1 R01 LM06916-01, Biotechnology, $984,026

• Obradovic, Z. and Tomsovic, K., (August 2000 - August 2004) “Towards an Understanding of Deregulated Electricity Markets through Time Series Analysis," Power Systems and Intelligent Systems Programs, Division of Engineering, National Science Foundation, Grant ECS-9988626, $240,000.

• Obradovic, Z. and Dunker, A.K., (June 1998 - December 2001) “Intelligent Data Analysis for Identifying Protein Disorder," cross-disciplinary funding by KDI Knowledge and Distributed Intelligence Initiative, Division of Information and Intelligent Systems and Division of Molecular and Cellular Biosciences, National Science Foundation, Grant IIS-9711532, $379,910.

• Obradovic, Z. and Dunker, A.K., (January 2000 - May 2001) “Supplement to Intelligent Data Analysis for Identifying Protein Disorder," Knowledge and Cognitive Systems Program, National Science Foundation, $50,858.

• Obradovic, Z. and Dunker, A.K., (January 2000 - December 2000) “REU Supplement to Intelligent Data Analysis for Identifying Protein Disorder," Knowledge and Cognitive Systems Program, National Science Foundation, $15,000.

• Obradovic, Z. (January 2000 - December 2000) “Tools for Analyzing Learned Business Valuation Models and for Construction Higher-Representation Value Driving Attributes," Valueminer.com Inc., $40,000.

• Obradovic, Z. and Fiez, T., (January 1998 - September 2000) “Integration of Distributed Heterogeneous Experts for Knowledge Discovery in Precision Agriculture,” Idaho National Engineering and Environmental Laboratory University Research Consortium, $307,832.

• Obradovic, Z. (May 1999 - May 2000) “An Intangible Assets Analysis System for Identifying Enterprise Value Drivers," Valueminer.com Inc., $36,000.

• Obradovic, Z., (July 1993 - June 1997) “RIA: Efficient and Accurate Prediction Systems for Large Scale Problems," Knowledge and Cognitive Systems Program, National Science Foundation, $100,000.

• Obradovic, Z. and Meador, J., (July 1991 - June 1993) “Parametric Fault Diagnosis in Mixed-Signal Integrated Circuits,” National Science Foundation Center for Design of Analog-Digital Integrated Circuits, $90,000.

• Obradovic, Z. (Fall 1997 - Spring 1998) “Predicting Disordered Protein Structure from Amino Acid Sequence - Research Project Supervision for R. Sarac," Howard Hughes Undergraduate Research Fellowship.

• Obradovic, Z. (June 1997 - August 1997) “Intelligent Systems for Data Analysis and Modeling - - Research Project for Teachers," WSU / National Science Foundation Summer Teacher's Institute.

• Obradovic, Z. (June 1996) “Neural Network Design for Medical Applications – Research Project for Students," 1996 WSU / Howard Hughes Summer Science and Engineering Scholars Program.

• Obradovic, Z. (June 1995 - August 1995) “Analysis and Comparison of Prediction Systems for Very Noisy Domains - Research Project for Teachers," WSU / National Science Foundation Summer Teacher's Institute.

• Obradovic, Z. and Drossu, R. (June 1994) “Computer-aided Diagnosis in Medicine – Research Project for Students," 1994 WSU / Howard Hughes Summer Science and Engineering Scholars Program.

• Obradovic, Z., (July 1992 - May 1993) “Neural Networks for Large Learning Problems," Washington State University Research Grant-in-Aid Program 1992-93.

PUBLICATIONS:

I. BIOMEDICAL INFORMATICS:

Biomedical Informatics: Journal Articles

1. Potireddy, S, Midic, U., Liang, C.G., Obradovic, Z., Latham, K.E. (in press) “Positive and negative cis-regulatory elements directing postfertilization maternal mRNA translational control in mouse embryos,” Am J Physiol Cell Physiol 299.

2. Garriga, J., Xie, H., Obradovic, Z., Grana, X. (2010) “Selective Control of Gene Expression by CDK9 in Human Cells,” Journal of Cellular Physiology, vol. 222(1):200-8.

3. Midic, U., Oldfield, C.J., Dunker, A.K., Obradovic, Z., Uversky, V.N. (2009) “Unfoldomics of Human Genetic Diseases: Examples of Ordered and Intrinsically Disordered Members of the Human Diseasome,” Protein and Peptide Letters, vol. 16, no. 12, pp. 1533-1547.

4. Uversky, V.N., Oldfield, C.J., Midic, U., Xie, H., Xue, B., Vucetic, S., Iakoucheva, L.M., Obradovic, Z., Dunker, A.K., (2009) “Unfoldomics of Human Diseases: Linking Protein Intrinsic Disorder with Diseases,” BMC Genomics, vol. 10 Suppl 1:S07.

5. Midic, U., Oldfield, C.J., Dunker, A.K., Obradovic, Z., Uversky, V.N. (2009) “Protein Disorder in the Human Diseasome: Unfoldomics of Human Genetic Diseases,” BMC Genomics, vol. 10 Suppl 1:S12.

6. Li, A., Xie H., Chin, M.H., Obradovic, Z., Smith, D.J., Megalooikonomou, V. (2009) “Analysis of Multiplex Gene Expression Maps Obtained by Voxelation,” BMC Bioinformatics, 10 Suppl 4:S10.

7. Megalooikonomou, V., Kontos, D., Pokrajac, D., Lazarevic, A., Obradovic, Z. (2008) “An Adaptive Partitioning Approach for Mining Discriminant Regions in 3D Image Data,” Journal of Intelligent Information Systems, vol 31, no. 3, pp. 217-242.

8. Dunker, K., Oldfield, C.J., Meng, J., Romero, P., Yang, J., Chen, J.W., Vacic, V., Obradovic, Z. and Uversky, V.N. (2008) “The Unfoldomics Decade: An Update on Intrinsically Disordered Proteins,” BMC Genomics, vol. 9 (Suppl 2):S1, 16.

9. Xu, Q., Canutescu, A., Wang, G., Shapavalov, M.V., Obradovic, Z. and Dunbrack, R.L. (2008) “Statistical Analysis of Interfaces in Crystals of Homologous Proteins,” J. Molecular Biology, vol. 381, pp. 487-507.

10. Ren, S., Uversky, V.N., Chen, Z., Dunker, A.K. and Obradovic, Z. (2008) “Short Linear Motifs recognized by SH2, SH3 and Ser/Thr Kinase domains are conserved in disordered protein regions,” BMC Genomics, vol. 9 (Suppl 2):S26, 9. Sept.

11. Krynetskaia, N., Xie, X., Vucetic, S., Obradovic, Z., Krynetskiy, E. (2008), “High Mobility Group Protein B1 is an Activator of Apoptotic Response to Antimetabolite Drugs,” Molecular Pharmacology, Jan;73(1):260-9.

12. Xie, H., Vucetic, S. Iakoucheva L.M., Oldfield C.J., Dunker, A.K., Uversky, V.N. and Obradovic, Z. (2007) “Functional Anthology of Intrinsic Disorder. I. Biological Processes and Functions of Proteins with Long Disordered Regions,” Journal of Proteome Research, May 4;6(5):1882-98.

13. Vucetic, S., Xie, H., Iakoucheva L.M., Oldfield C.J., Dunker, A.K., Obradovic, Z. and Uversky, V.N. (2007) “Functional Anthology of Intrinsic Disorder. II. Cellular Components, Domains, Technical Terms, Developmental Processes,” Journal of Proteome Research, May 4;6(5):1899-1916.

14. Xie, H., Vucetic, S. Iakoucheva L.M., Oldfield C.J., Dunker, A.K., Obradovic, Z. and Uversky, V.N. (2007) “Functional Anthology of Intrinsic Disorder. III. Ligands, Post-translational Modifications and Diseases Associated with Intrinsically Disordered Proteins,” Journal of Proteome Research, May 4;6(5):1917-1932.

15. Radivojac, P., Iakoucheva, L.M., Oldfield C.J., Obradovic, Z., Uversky, V.N., Dunker A.K. (2007) “Intrinsic Disorder and Functional Proteomics,” Biophysical Journal, vol. 92, March 2007, pp. 1439-1456.

16. Midic, U. Dunker, K. and Obradovic, Z. (2007) “Exploring alternative knowledge representations for protein secondary-structure prediction,” Int’l Journal of Data Mining and Bioinformatics, 1(3):286-313.

17. Sickmeier, M., Hamilton, A., LeGall, T. Vacic, V., Cortese, M.S., Uversky, V.N., Tompa, P., Obradovic, Z. and Dunker, A.K. (2007) “DisProt: The Database of Disordered Proteins,” Nucleic Acids Research, 35(Database issue):D786-93.

18. Xu, Q., Canutescu, A., Obradovic, Z. and Dunbrack, R.L. (2006) “ProtBuD: A Database of Biological Unit Structures of Protein Families and Superfamilies,” Bioinformatics, Dec 1;22(23):2876-82.

19. Han, B., Obradovic, Z., Hu, Z.Z., Wu, C. H. and Vucetic, S. (2006) “Substring Selection for Biomedical Document Classification,” Bioinformatics, Dec 1;22(23):2876-82.

20. Romero, P., Zaidi, S., Fang, Y.Y., Uversky, V.N., Radivojac, P., Oldfield, C., Cortese M., LeGall, T., Obradovic, Z. and Dunker, A.K. (2006) “Alternative Splicing in Concert with Protein Intrinsic Disorder Enables Increased Functional Diversity in Multicellular Organisms,” The Proceedings of the National Academy of Sciences, vol. 103, no. 22, 8390-8395, May 30.

21. Peng, K., Radivojac, P., Vucetic, S., Dunker, A.K. and Obradovic, Z. (2006) “Length-Dependent Prediction of Intrinsic Protein Disorder,” BMC Bioinformatics, vol. 7 (1), 208, April 17.

22. Radivojac, P., Vucetic, S., O’Connor, T.R., Uversky, V.N., Obradovic, Z. and Dunker, A.K. (2006) “Calmodulin Signaling: Analysis and Prediction of a Disorder-Dependent Molecular Recognition,” Proteins: Structure, Function and Bioinformatics, vol. 63(2), pp. 398-410, May 1.

23. Obradovic, Z., Peng, K., Vucetic, S., Radivojac, P., and Dunker, A.K. (2005) “Exploiting Heterogeneous Sequence Properties Improves Prediction of Protein Disorder,” Proteins: Structure, Function and Bioinformatics, vol. 61, Suppl. 7, pp. 176-182.

24. Peng, K., Vucetic, S., Radivojac, P., Brown, C.J., Dunker, A.K. and Obradovic, Z. (2005) “Optimizing Long Intrinsic Disorder Predictors with Protein Evolutionary Information,” Journal of Bioinformatics and Computational Biology, vol. 3, no. 1, pp. 35-60.

25. Vucetic, S., Obradovic, Z., Vacic, V., Radivojac, P., Peng, K., Lawson, J.D., Brown, C.J., Sikes, J.G., Newton, C. and Dunker, A.K. (2005) “Disprot: A Database of Protein Disorder,” Bioinformatics, vol 21, no. 1, pp. 137-40.

26. Pokrajac, D., Megalooikonomou, V., Lazarevic, A., Kontos, D. and Obradovic, Z. (2005) “Applying Spatial Distribution Analysis Techniques to Classification of 3D Medical Images,” International Journal Artificial Intelligence in Medicine, Vol. 33, No 3, pp. 261-80.

27. Romero, P., Obradovic, Z., and Dunker, A.K. (2004) “Natively Disordered Proteins: Functions and Predictions,” Applied Bioinformatics, 3(2-3), pp. 105-13.

28. Radivojac, P., Chawla, N. V., Dunker, A.K., and Obradovic, Z. (2004) “Classification and Knowledge Discovery in Protein Databases,” Journal of Biomedical Informatics, vol. 37, pp. 224-239.

29. Iakoucheva, L.M., Radivojac, P., Brown, C.J., O’Connor, T.R., Sikes, J.G., Obradovic, Z. and Dunker, A.K. (2004) “The Importance of Intrinsic Disorder for Protein Phosphorylation,” Nucleic Acids Research, vol. 32, no. 3, pp. 1037-1049.

30. Obradovic, Z, Peng, K, Vucetic, S., Radivojac, P., Brown, C., and Dunker, A.K. (2003) “Predicting Intrinsic Disorder from Amino Acid Sequence,” Proteins: Structure, Function and Genetics, vol. 53 Suppl 6, pp. 566-72.

31. Radivojac, P., Obradovic, Z., Smith D.K., Zhu, G., Vucetic, S., Brown, C., Lawson, J.D. and Dunker, A.K., (2003) “Protein flexibility and intrinsic disorder,” Protein Science, vol. 13, pp. 71-80.

32. Vucetic, S., Brown C., Dunker A.K and Obradovic, Z. (2003) “Flavors of Protein Disorder,” Proteins: Structure, Function and Genetics, vol. 52, pp. 573-584.

33. Smith, D. K., Radivojac, P., Obradovic, Z., Dunker, A. K. and Zhu, G. (2003) “Improved Amino Acid Flexibility Parameters,” Protein Science, vol 12, pp. 1060-1072.

34. Iakoucheva, L.M., Brown, C.J., Lawson, J.D., Obradovic, Z. and Dunker A.K. (2002) “Intrinsic Disorder in Cell-signaling and Cancer-associated Proteins," Journal of Molecular Biology, vol. 323, pp. 573-584.

35. Dunker, A.K., Brown, C.J., Lawson, J.D., Iakoucheva, L.M. and Obradovic, Z. (2002) “Intrinsic Disorder and Protein Function," Biochemistry, May 28th, vol. 41, issue 21, pp. 6573 - 6582.

36. Dunker, A.K., Brown, C.J. and Obradovic, Z. (2002) “Identification and Functions of Usefully Disordered Proteins," Advances in Protein Chemistry, vol. 62, pp. 25-49.

37. Dunker, A.K and Obradovic, Z. (2001) “The Protein Trinity - Linking Function and Disorder," Nature Biotechnology, vol. 19, Sept., pp. 805-806.

38. Dunker A.K., Lawson J.D., Brown C.J., Romero P., Oh J., Oldfield C.J., Campen A.M., Ratlif, Hipps K.W., Ausio J., Nissen M.S., Reeves R., Kang C.H., Kissinger C.R., Bailey R.W., Griswold M.D., Chiu W., Garner E.C. and Obradovic Z. (2001) “Intrinsically Disordered Proteins," Journal of Molecular Graphics and Modeling, vol. 19, pp. 28-61.

39. Romero, P., Obradovic, Z., Li, X., Garner, E., Brown, C.J. and Dunker, A.K. (2001) “Sequence Complexity and Disordered Protein," Proteins: Structure, Function and Genetics, vol. 42, pp. 38-48.

40. Romero, P., Obradovic, Z and Dunker K. (2000) “Intelligent Data Analysis for Identifying Protein Disorder," Issues on Application of Data Mining, Artificial Intelligence Review, Vol. 14, No. 6, S2, pp. 447-484.

41. Romero, P., Obradovic, Z. and Dunker, A.K. (1999) “Folding Minimal Sequences: The Lower Bound for Sequence Complexity of Globular Proteins,” FEBS Letters, vol. 462, pp. 363-367.

42. Dunker, A.K., Obradovic, Z., Romero, P., Kissinger, C. and Villafranca, J.E. (1997) “On the Importance of Being Disordered," Protein Data Bank Quarterly Newsletter, Release no. 81, pp. 3-5.

Biomedical Informatics: Peer Reviewed Book Chapters

43. Xie, H., Obradovic, Z. and Vucetic, S. (2009) “Mining of Microarray, Proteomics, and Clinical Data for Improved Identification of Chronic Fatigue Syndrome,” chapter 9 in McConnell, P, Lim, S., and A.J. Cuticchia, Methods of Microarray Data Analysis VI. (Scotts Valley, California: CreateSpace Publishing, 2009), pp. 119-127.

44. Xie, H., Midic, U., Vucetic, S. and Obradovic, Z. (2008) “Algorithmic Methods for the Analysis of Gene Expression Data,” chapter 4 in Handbook of Applied Algorithms: Solving Scientific, Engineering, and Practical Problems (eds. A. Nayak and I. Stojmenovic), Wiley-IEEE Press, pp. 115-146.

45. Uversky V.N., Radivojac, P., Iakoucheva, L.M., Obradovic, Z. and Dunker, A.K. (2007) “Prediction of Intrinsic Disorder and its Use in Functional Proteomics,” chapter 5 in Methods in Molecular Biology vol. 408: Gene Function Analysis (ed. M. Ochs), Humana Press Inc., Totowa, N.J.

46. Peng, K. Obradovic, Z. and Vucetic, S. (2006) “Supervised Learning under Sample Selection Bias from Protein Structure Databases,” in Advances in Applied and Computational Mathematics, Nova Science Publishers, pp. 153-170.

Biomedical Informatics: Fully Refereed Conference Articles

47. Zhang, P., Obradovic, Z. (2010) “Unsupervised Integration of Multiple Protein Predictors,” Proc. IEEE International Conference on Bioinformatics and Biomedicine, Hong Kong.

48. Li, A., Xie, H., Obradovic, Z., Smith, D.J, Megalooikonomou, V. (2010) “Identify Gene Functions using Functional Expression Profiles obtained by Voxelation,” ACM International Conference on Bioinformatics and Computational Biology, Niagara Falls.

49. Li, A., Obradovic, Z., Smith, D.J., Bodenreider, O., Megalooikonomou, V. (2009) “Mining Association Rules among Gene Functions in Clusters of Similar Gene Expression Maps,” Proc. Workshop on Data Mining in Functional Genomics at the IEEE International Conference on Bioinformatics and Biomedicine, Washington D.C., November 2009.

50. Midic U, Dunker A.K., and Obradovic, Z. (2009) “Protein Sequence Alignment and Intrinsic Disorder: A Substitution Matrix for an Extended Alphabet,” Proc. Workshop on Statistical and Relational Learning and Mining in Bioinformatics at the 15th ACM SIGKDD Conf. Knowledge Discovery and Data Mining, Paris, France, June 2009.

51. Gao, J., Agrawal, G.K., Thelen, J.J., Obradovic, Z., Dunker, A.K., Xu, D. (2009) “A New Machine Learning Approach for Protein Phosphorylation Site Prediction in Plants,” Lecture Notes in Bioinformatics (LNBI 5462), Proc. First Int’l Conf. on Bioinformatics and Computational Biology (BICoB), April 2009, New Orleans, USA, pp. 18-29.

52. Ren, S. and Obradovic, Z. (2008) “Improvement of Survival Prediction from Gene Expression Profiles by Mining of Prior Knowledge,” Proc. IEEE Int’l Conf. on Bioinformatics and Biomedicine, Philadelphia, Nov. 2008.

53. An L., Xie H., Chin M., Obradovic Z., Smith D., Megalooikonomou V., (2008) “Analysis of Multiplex Gene Expression Maps Obtained by Voxelation,” Proc. IEEE Int’l Conf. on Bioinformatics and Biomedicine, Philadelphia, Nov. 2008.

54. Dunker, K., Oldfield, C.J., Meng, J. Romero, P., Yang, J., Obradovic, Z. and Uversky, V.N. (2007) “Intrinsically Disordered Proteins: An Update,” Proc. IEEE 7th Int’l Symp. Bioinformatics and Bioengineering, Harvard Medical School, Cambridge, MA, pp. 49-58.

55. Midic, U., Dunker, K. and Obradovic, Z. (2005) “Improving Protein Secondary-Structure Prediction by Predicting Ends of Secondary-Structure Segments,” Proc. 2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, San Diego, CA, pp. 490-497.

56. Peng, K, Vucetic, S. and Obradovic, Z. (2005) “Correcting Sampling Bias in Structural Genomics through Iterative Selection of Underrepresented Targets,” Proc. 5th SIAM Int'l Conf. on Data Mining, Newport Beach, CA, pp. 621-625.

57. Xie, H., Vucetic, S., Sun, H., Hedge, P and Obradovic, Z. (2004) “Characterization of Gene Functional Expression Profiles of Plasmodium Falciparum,” Proc. 5th Conf. on Critical Assessment of Microarray Data Analysis, Durham, North Carolina.

58. Radivojac, P., Obradovic, Z., Dunker, A.K. and Vucetic, S. (2004) “Feature Selection Filters Based on Permutation Test,” Proc. 15th European Conference on Machine Learning, Pisa, Italy.

59. Peng, K., Obradovic, Z. and Vucetic, S., (2004) “Towards Efficient Learning of Neural Network Ensembles from Arbitrarily Large Datasets,” Proc. 16th European Conf. on Artificial Intelligence, Valencia, Spain, pp. 623-627.

60. Pokrajac, D., Lazarevic, A., Singleton, T. and Obradovic, Z. (2004) “Localized Neural Network Based Distributional Learning for Knowledge Discovery in Protein Databases,” Proc. Int’l Joint Conf. Neural Networks, Budapest, Hungary.

61. Peng, K., Obradovic, Z. and Vucetic, S., (2004) “Exploring Bias in the Protein Data Bank Using Contrast Classifiers,” Proc. 9th Pacific Symposium on Biocomputing, Hawaii, pp. 435-446.

62. Kontos, D., Megalooikonomou, V., Pokrajac, D., Lazarevic, A., Obradovic, Z., Ford, J., Makedon, F. and Saykin, A.J. (2004) “Extraction of Discriminative Functional MRI Activation Patterns and an Application to Alzheimer’s Disease,” Proc. 7th Int’l Conf. on Medical Image Computing and Computer-Assisted Intervention, Saint-Malo, France, Lecture Notes in Computer Science 3217, Springer, Vol. 2, pp. 727-735.

63. Peng, K., Vucetic, S., Han, B., Xie H. and Obradovic, Z. (2003) “Exploiting Unlabeled Data for Improving Accuracy of Predictive Data Mining," Proc. 3rd IEEE Int’l Conf. Data Mining, Melbourne, Fl, pp. 267-274.

64. Han, B., Vucetic, S. and Obradovic, Z. (2003) “Reranking Medline Citations by Relevance to a Difficult Biological Query," Proc. IASTED Int'l Conf. Neural Networks and Computational Intelligence, Cancun, Mexico, pp. 38-43.

65. Vucetic, S., Pokrajac, D., Xie H. and Obradovic, Z. (2003) “Detection of Underrepresented Biological Sequences Using Class-Conditional Distribution Models," Proc. Third SIAM Int'l Conf. on Data Mining, San Francisco, CA, pp. 279-283.

66. Radivojac, P., Obradovic, Z., Brown, C.J. and Dunker, A.K. (2003) “Prediction of Boundaries Between Intrinsically Ordered and Disordered Protein Regions,” Proc. 8th Pacific Symposium on Biocomputing, Hawaii, pp. 216-227.

67. Radivojac, P., Obradovic, Z., Brown, C.J. and Dunker, A.K. (2002) “Improving Sequence Alignments for Intrinsically Disordered Proteins," Proc. 7th Pacific Symposium on Biocomputing, Hawaii, pp. 589-600.

68. Dunker, A.K., Brown. C.J, Lawson, J.D., Iakoucheva-Sebat, L.M., Vucetic, S. and Obradovic, Z. (2002) “The Protein Trinity: Structure/Function Relationships that Include Intrinsic Disorder,” Proc. 2002 Miami Nature Biotechnology Winter Symp., The Scientific Word, 2(S2), 49-50.

69. Megalooikonomou, V., Pokrajac, D., Lazarevic, A., and Obradovic, Z. (2002) “Effective Classification of 3D Image Data using Partitioning Methods," Proc. SPIE Visualization and Data Analysis 2002 Conf., San Jose, CA, pp. 62-73.

70. Vucetic, S., Radivojac, P., Dunker, A.K., Brown, C.J. and Obradovic, Z. (2001) “Methods for Improving Protein Disorder Prediction," Proc. 2001 IEEE/INNS International Joint Conference on Neural Networks, Washington D.C., vol. 4, pp. 2718-2723. ISBN: 0-7803-7044-9

71. Williams, R.M., Obradovic, Z., Mathura, V., Braun, W., Garner, E.C., Young, J., Takayama, S., Brown, C.J. and Dunker A.K. (2001) “The Protein Non-Folding Problem: Amino Acid Determinants of Intrinsic Order and Disorder," Proc. 6th Pacific Symposium on Biocomputing, Maui, Hawaii, pp. 89-100.

72. Lazarevic, A., Pokrajac, D., Megalooikonomou, V. and Obradovic, Z. (2001) “Distinguishing Among 3-D Distributions for Brain Image Data Classification," Proc. 4th International Conference of Neural Networks and Expert Systems in Medicine and Health Care, Milos Island, Greece, pp. 389-396.

73. Pokrajac, D., Lazarevic, A., Megalooikonomou, V. and Obradovic, Z. (2001) “Classification of Brain Image Data using Measures of Distributional Distance," Human Brain Mapping, Brighton, UK.

74. Dunker, A.K., Obradovic, Z., Romero, P., Garner, E.C and Brown, C.J. (2000) “Intrinsic Protein Disorder in Complete Genomes," In S. Miyano and T. Takagi (editors) Proc. Genome Informatics 11, Tokyo, Japan, pp. 161-171.

75. Li, X., Obradovic, Z., Brown, C. J., Garner, E. C., Dunker, A. K. (2000) “Comparing Predictors of Disordered Protein,” In S. Miyano and T. Takagi (editors) Proc. Genome Informatics 11, Tokyo, Japan, pp. 172-184.

76. Li, X., Rani, M., Romero, P., Obradovic, Z. and Dunker, A.K. (1999) “Predicting Protein Disordered Regions for N-, C- and Internal Regions," In S. Miyano & T. Takagi (editors) Proc. Genome Informatics 10, Tokyo, Japan, pp. 30-40.

77. Garner, E., Romero, P., C.J. Brown, Obradovic, Z. and Dunker, A.K. (1999) “Predicting Binding Regions within Disordered Proteins," In S. Miyano & T. Takagi (editors) Proc. Genome Informatics 10, Tokyo, Japan, pp. 41-50.

78. Xie, Q., Arnold, G.E., Romero, P., Obradovic, Z., Garner, E and Dunker, A.K. (1998) “The Sequence Attribute Method for Determining Relationships Between Sequence and Protein Disorder," In S. Miyano & T. Takagi (editors) Proc. Genome Informatics 1998, Tokyo, Japan, pp. 193-200.

79. Garner, E., Cannon, P., Romero, P., Obradovic, Z. and Dunker, A.K. (1998) “Predicting Disordered Regions from Amino Sequence: Common Theme Despite Differing Structural Characterization,” In S. Miyano & T. Takagi (editors) Proc. Genome Informatics 1998, Tokyo, Japan, pp. 201-213.

80. Rani, M., Romero, P., Obradovic, Z. and Dunker, A.K. (1998) “Annotation of PDB with respect to Disordered Regions in Proteins," In S. Miyano & T. Takagi (editors) Proc. Genome Informatics 1998, Tokyo, Japan, pp. 240-241.

81. Romero, P., Obradovic, Z., Kissinger, C., Villafranca, J.E., Garner, E., Guilliot, S. and Dunker, A.K. (1998) “Thousands of Proteins Likely to Have Long Disordered Regions," Proc. Pacific Symposium on Biocomputing, Maui, Hawaii, vol. 3, pp. 435-446.

82. Dunker, A.K., Garner E., Guilliot S., Romero P., Albrecht K., Hart J., Obradovic Z., Kissinger C., and Villafranca, J.E., (1998) “Protein Disorder and the Evolution of Molecular Recognition: Theory, Predictions and Observations," Proc. Pacific Symposium on Biocomputing, Maui, Hawaii, vol. 3, pp. 471-482.

83. Romero, P., Obradovic, Z and Dunker A.K. (1997) “Sequence Data Analysis for Long Disordered Regions Prediction in the Calcineurin Family," In S. Miyano & T. Takagi (editors) Proc. Genome Informatics 1997, Tokyo, Japan, pp. 110-125.

84. Romero, P., Obradovic, Z., Kissinger, C., Villafranca, J.E. and Dunker, A.K. (1997) “Identifying Disordered Regions in Proteins from Amino Acid Sequence," Proc. IEEE Int. Conf. on Neural Networks, Houston, TX, vol. 1, pp. 90-95.

II. DATA MINING:

Spatial and Spatio-Temporal Data Mining: Journal Articles

85. Ouzienko, V., Guo, Y., Obradovic, Z. (in review) “A Decoupled Exponential Random Graph Model for Prediction of Structure and Attributes in Temporal Social Networks.”

86. Mennis, J., Harris, P., Obradovic, Z., Izenman, A., Grunwald, H., and Lockwood, B., (in press) “The effect of neighborhood characteristics and spatial spillover on urban juvenile delinquency and recidivism,” The Professional Geographer.

87. Radosavljevic, V., Vucetic, S., Obradovic, Z. (2010) “A Data Mining Technique for Aerosol Retrieval Across Multiple Accuracy Measures,” IEEE Geoscience and Remote Sensing Letters, vol. 7, no. 2, pp. 411-415.

88. Vucetic, S., Han, B., Mi, W., Li, Z., Obradovic, Z. (2008) “A Data Mining Approach for the Validation of Aerosol Retrievals,” IEEE Geoscience and Remote Sensing Letters, vol. 5, no. 1, pp. 113-117.

89. Han, B., Vucetic, S., Braverman, A. and Obradovic, Z. (2006) “A Statistical Complement to Deterministic Algorithms for Retrieving Aerosol Optical Thickness from Radiance Data,” Engineering Applications of Artificial Intelligence, vol. 19, no. 7, pp. 787-795.

90. Pokrajac, D., Obradovic, Z. (accepted with minor revisions) “Spatial-Temporal Prediction with Partial Attribute Observability," Computers and Geoscience.

91. Pokrajac, D., Hoskinson, R.L. and Obradovic, Z. (2003) “Modeling Spatial-Temporal Data with a Short Observation History," Knowledge and Information Systems. Vol. 5, pp. 368-386.

92. Pokrajac, D., Fiez, T. and Obradovic, Z. (2002) “A Data Generator for Evaluating Spatial Issues in Precision Agriculture,” Precision Agriculture, Vol. 3, no. 3, pp. 259-282.

93. Lazarevic, A. and Obradovic, Z. (2001) “Adaptive Boosting Techniques in Heterogeneous and Spatial Databases,” Intelligent Data Analysis, Vol. 5, pp. 1-24.

94. Vucetic, S., Fiez, T. and Obradovic, Z. (2000) “Analyzing the Influence of Data Aggregation and Sampling Density on Spatial Estimation,” Water Resources Research, Vol. 36, No. 12, pp. 3721-3731.

Spatial and Spatio-Temporal Data Mining: Peer Reviewed Book Chapters

95. Han, B., Obradovic, Z. and Vucetic, S. (2008) “Using Statistical Methods to Improve Efficiency and Accuracy of Aerosol Retrievals,” Chapter 7 in Discrete and Computational Mathematics, Nova Science Publishers, Editors: F. Liu, Gaston M. N'Guerekata, D. Pokrajac, X. Shi, J. Sun, X. Xia, pp. 93-106.

Spatial and Spatio-Temporal Data Mining: Fully Refereed Conference Articles

96. Ouzienko, V. Obradovic, Z., (in review) “Imputation of Missing Links and Attributes in Longitudinal Social Networks.”

97. Radosavljevic, V., Vucetic, S., Obradovic, Z., (in review) “Cooperative Continuous Conditional Random Fields for Structured Prediction.”

98. Lou, Q., Obradovic, Z. (in review) “Modeling Multivariate Spatio-Temporal Data with Large Gaps.”

99. Mathew, G., Obradovic, Z. (2011) “A Privacy-preserving Framework for Distributed Clinical Decision Support,” Proc. IEEE International Conference on Computational Advances in Bio and Medical Sciences, Orlando, Florida.

100. Jun, G., Ghosh, J., Radosavljevic, V., Obradovic, Z. (2010) “Predicting Ground-Based Aerosol Optical Depth with Satellite Images via Gaussian Processes,” Proc. International Conference on Knowledge Discovery and Information Retrieval, Valencia, Spain.

101. Obradovic, Z., Das, D., Radosavljevic, V., Ristovski, K., Vucetic, S. (2010) “Spatio-Temporal Characterization of Aerosols through Active Use of Data from Multiple Sensors,” Proc. International Society for Photogrammetry and Remote Sensing (ISPRS) Technical Commission VII Symposium, July 5-7, Vienna, Austria, ISPRS International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences.

102. Radosavljevic, V., Obradovic, Z., Vucetic, S. (2010) “Continuous Conditional Random Fields for Regression in Remote Sensing,” Proc. 19th European Conf. on Artificial Intelligence, August, Lisbon, Portugal.

103. Ouzienko, V., Guo, Y., Obradovic, Z. (2010) “Prediction of Attributes and Links in Temporal Social Networks,” Proc. 19th European Conf. on Artificial Intelligence, August, Lisbon, Portugal.

104. Ristovski, K., Das, D. Ouzienko, V., Guo, Y., Obradovic, Z. (2010) “Regression Learning with Multiple Noisy Oracles,” Proc. 19th European Conf. on Artificial Intelligence, August, Lisbon, Portugal.

105. Lou, Q., Obradovic, Z. (2010) “Feature Selection by Approximating the Markov Blanket in a Kernel-Induced Space,” Proc. 19th European Conf. on Artificial Intelligence, August, Lisbon, Portugal.

106. Das, D., Obradovic, Z., Vucetic, S. (2009) “Active Selection of Sensor Sites in Remote Sensing Applications,” Proc. IEEE International Conference on Data Mining, December, Miami, FL. pp. 758-763.

107. Radosavljevic, V., Vucetic, S., Obradovic, Z. (2009) “Reduction of Ground-Based Sensor Sites for Spatio-Temporal Analysis of Aerosols,” Proc. 3rd International Workshop on Knowledge Discovery from Sensor Data at the 15th ACM SIGKDD Conf. Knowledge Discovery and Data Mining, Paris, France, June 2009.

108. Ristovski, K., Vucetic, S., Obradovic, Z. (2009) “Evaluation of a Neural Networks based Approach for Aerosol Optical Depth Retrieval and Uncertainty Estimation,” Proc. Int’l Conf. on Space Technology, Thessaloniki, Greece, Aug. 2009.

109. Ayuyev, V., Jupin, J., Harris, P. and Obradovic, Z. (2009) “Dynamic Clustering Based Estimation of Missing Values in Mixed Type Data,” Proc. 11th Int’l Conf. on Data Warehousing and Knowledge Discovery, Linz, Austria, Sept. 2009, pp. 366-377.

110. Das, D., Radosavljevic, V., Vucetic, S., Obradovic, Z. (2008) “Reducing Need for Collocated Ground and Satellite based Observations in Statistical Aerosol Optical Depth Estimation,” IEEE Int’l Geoscience and Remote Sensing Symposium, July, Boston, MA.

111. Radosavljevic, V., Vucetic, S., Obradovic, Z. (2008) “Spatio-Temporal Partitioning for Improving Aerosol Prediction Accuracy,” Proc. Eighth SIAM Int’l Conf. on Data Mining, April 24-26, 2008, Atlanta, GA, USA.

112. Zhuang, W., Radosavljevic, V., Han, B., Obradovic, Z., Vucetic, S. (2008) “Aerosol Optical Depth Prediction from Satellite Observations by Multiple Instance Regression,” Proc. Eighth SIAM Int’l Conf. on Data Mining, Atlanta, GA, USA, 2008.

113. Radosavljevic, V., Vucetic, S., Obradovic, Z. (2007) “Aerosol Optical Depth Retrieval by Neural Network Ensembles with Adaptive Cost Function,” Proc. 10th Int’l Conf. Engineering Applications of Neural Networks, Thessaloniki, Greece, Aug. 2007, pp. 266-275.

114. Han, B., Obradovic, Z, Li, Z. and Vucetic, S., (2006) “Data Mining Support for Improvement of MODIS Aerosol Retrievals,” Proc. IEEE Int’l Geoscience and Remote Sensing Symp., Denver, CO, Aug. 2006.

115. Obradovic, Z, Han, B., Xu, Q., Li, Y., Braverman, A., Li, Z. and Vucetic, S. (2006) “Data Mining Support for Aerosol Retrieval and Analysis – Project Summary,” NASA Data Mining Workshop, Pasadena, CA, May 2006.

116. Qin, Y. and Obradovic, Z. (2006) “Efficient Learning from Massive Spatial-Temporal Data through Selective Support Vector Propagation,” Proc. 17th European Conf. on Artificial Intelligence, Riva Del Garda, Italy.

117. Han, B., Vucetic, S., Braverman, A. and Obradovic, Z (2005) “Integration of Deterministic and Statistical Algorithms for Aerosol Retrieval,” Proc. International Conference on Novel Applications of Neural Networks in Engineering, Lille, France, Aug. 2005, pp. 85-92.

118. Han, B., Vucetic, S., Braverman, A. and Obradovic, Z. (2005) “Construction of an accurate geospatial predictor by fusion of global and local models,” Proc. IEEE 8th International Conference on Information Fusion, B.11.2 pp. 1-8, Philadelphia, PA, July 2005.

119. Xu, Q., Han, B., Li, Y., Braverman, A., Obradovic, Z. and Vucetic, S. (2005) “Improving aerosol retrieval performance by integrating AERONET, MISR, and MODIS data products,” Proc. IEEE 8th International Conference on Information Fusion, B.11.3 pp. 1-8, Philadelphia, PA, July 2005.

120. Pokrajac, D., Hoskinson, R., Lazarevic, A., Obradovic, Z. (2002) “Spatial-Temporal Techniques for Prediction and Compression of Soil Fertility Data," Proc. 6th International Conference on Precision Agriculture, Minneapolis, MN.

121. Hoskinson, R., Pokrajac, D., Obradovic, Z., Lazarevic, A. (2002) “The Unpredictability of Soil Fertility across Space and Time," Proc. 6th International Conference on Precision Agriculture, Minneapolis, MN.

122. Pokrajac, D. and Obradovic, Z. (2001) “Improved Spatial-Temporal Forecasting through Mining,” Proc. First SIAM Int’l Conf. on Data Mining, April 5-7, 2001, Chicago, USA.

123. Vucetic S. and Obradovic Z. (2000) “Discovering Homogeneous Regions in Spatial Data through Competition," Machine Learning: Proc. of the 17th Int'l. Conf., Stanford, CA, June 2000, pp. 1095-1102.

124. Pokrajac D. and Obradovic Z. (2000) “Combining Regressive and Auto-Regressive Models for Spatial-Temporal Prediction," Machine Learning of Spatial Knowledge Workshop at the 17th Int'l. Conf. on Machine Learning, Stanford, CA, June 2000.

125. Pokrajac, D. and Obradovic, Z. (2000) “Learning Heterogeneous Functions from Sparse and Non-Uniform Samples," Proc. IEEE-INNS-ENNS Int'l Joint Conf. on Neural Networks, Como, Italy, July 2000.

126. Pokrajac, D., Obradovic, Z. and Fiez, T. (2000) “Understanding the Influence of Noise, Sampling Density and Data Distribution on Spatial Prediction Accuracy," Track on Simulation Methodology and Control Engineering and Artificial Intelligence, R. V. Landeghem (Ed.): Proc. 14th European Simulation Multiconference - Simulation and Modeling: Enablers for a Better Quality of Life, May 23-26, 2000, Ghent, Belgium. SCS Europe 2000, ISBN 1-56555-204-0, pp. 706-708.

127. Pokrajac, D., Fiez, T. and Obradovic, Z. (2000) “A Tool for Controlled Knowledge Discovery in Spatial Domains," Track on Simulation Methodology, Tools and Standards, R. V. Landeghem (Ed.): Proc. 14th European Simulation Multiconference - Simulation and Modeling: Enablers for a Better Quality of Life, May 23-26, 2000, Ghent, Belgium. SCS Europe 2000, ISBN 1-56555-204-0, pp. 26-32.

128. Lazarevic, A., Fiez, T. and Obradovic, Z. (2000) “Adaptive Boosting for Spatial Functions with Unstable Driving Attributes," Proc. Pacific-Asia Conference on Knowledge Discovery and Data Mining, Kyoto, Japan, April 2000, Computer Science Editorial 3, Springer-Verlag, pp. 329-340.

129. Lazarevic, A., Fiez, T. and Obradovic, Z. (2000) “A Software System for Spatial Data Analysis and Modeling," Proc. Data Mining Minitrack at the IEEE Hawaii Int'l Conf. On System Sciences, IEEE Computer Society Press, January 2000.

130. Pokrajac, D., Lazarevic, A., Vucetic, S., Fiez T. and Obradovic Z. (1999) “Image Processing in Precision Agriculture," Proc. IEEE Int'l Conf. on Telecommunications in Modern Satellite, Cable and Broadcasting Services, Nis, Yugoslavia, October 1999, IEEE Press, v.2, pp. 616-619.

131. Pokrajac, D., Fiez, T., Obradovic, D., Kwek, S. and Obradovic, Z. (1999) “Distribution Comparison for Site-Specific Regression Modeling in Agriculture," Proc. IEEE/INNS Int'l Joint Conf. on Neural Networks, IEEE Press, ISBN 0-7803-5532-6, Washington D.C., July 1999, No. 346, Session 10.9.

132. Lazarevic, A., Xu, X., Fiez, T. and Obradovic, Z. (1999) “Clustering-Regression-Ordering Steps for Knowledge Discovery in Spatial Databases," Proc. IEEE/INNS Int'l Joint Conf. on Neural Networks, IEEE Press, ISBN 0-7803-5532-6, Washington D.C., July 1999, No. 345, Session 8.1B.

133. Vucetic, S., Fiez, T. and Obradovic, Z. (1999) “A Data Partitioning Scheme for Spatial Regression," Proc. IEEE/INNS Int'l Joint Conf. on Neural Networks, IEEE Press, ISBN 0-7803-5532-6, Washington D.C., July 1999, No. 348, Session 8.1A.

Spatial and Spatio-Temporal Data Mining: Invited Articles

134. Drossu, R., Fiez, T., Lazarevic, A., Pokrajac, D., Vucetic, S., and Obradovic, Z. (1998) “Use of Terrain Analysis in Yield Map Interpretation," Geographical Information Systems in Agriculture Conference, Orlando, Florida.

Predictive Data Mining Methods: Journal Articles:

135. Delibasic, B., Jovanovic, M., Vukicevic, M., Suknovic, M., Obradovic, Z. (in press) “Component-based decision trees for classification,” Intelligent Data Analysis, vol. 15 (5).

136. Suknovic, M., Delibasic, B., Jovanovic, M., Vukicevic, M., Becajski-Vujaklija, D., Obradovic, Z. (in press) “Reusable components in decision trees induction algorithms,” Computational Statistics.

137. Jones, P.R., Schwartz, D., Schwartz, I.M., Obradovic, Z., Jupin, J. (2007) “Risk Classification and Juvenile Dispositions: What is the State of the Art?” Temple Law Review, vol. 79, no. 2, pp. 461-498.

138. Vucetic, S. and Obradovic, Z. (2005) “Collaborative Filtering Using a Regression-Based Approach," Knowledge and Information Systems, Vol. 7, No. 1, pp. 1-22.

139. Pokrajac, D., Lazarevic, A. and Obradovic, Z. (2001) “Exploration-Exploitation Trade-Off in Machine Learning," Facta Universitatis, Ser. Elec. and Energ., vol. 14, no. 1, pp. 67-90.

Predictive Data Mining Methods: Peer Reviewed Book Chapters:

140. Schwartz, I.M., Jones, P.R., Schwartz, D., Obradovic, Z. (2008) “Improving Social Work Through the Use of Technology and Advanced Research Methods,” chapter 13 in Child Welfare Research: Advances for Practice and Policy (eds. Lindsey, D. and Shlonsky, A.) Oxford, pp. 214-230.

141. Obradovic, Z. and Vucetic, S. (2004) “Challenges in Scientific Data Mining: Heterogeneous, Biased, and Large Sample,” a peer reviewed book chapter at The Next Generation Data Mining (editors: H. Kargupta, A. Joshi, K. Sivakumar, Y. Yesha), AAAI/MIT Press, pp. 381-401.

Predictive Data Mining Methods: Fully Refereed Conference Articles:

142. Mathew, G. and Obradovic, Z. (2010) “Vocabularies in Collaboration Channels,” Proc. IEEE 6th Int’l Conf. on Collaborative Computing: Networking, Applications and Worksharing, Chicago, IL.

143. Song, M., Song, I.Y., Allen, R.B. and Obradovic, Z. (2006) “Improving Retrieval Performance by Automatic Query Expansion with Keyphrases and POS Phrase Categorization,” Proc. 6th ACM/IEEE-CS Joint Conf. Digital Libraries, Chapel Hill, NC.

144. Radivojac, P., Sivalingam, K. and Obradovic, Z. (2003) “Learning from Class-Imbalanced Data in Wireless Sensor Networks,” Proc. IEEE Semiannual Vehicular Technology Conference Fall 2003, Orlando, FL.

145. Vucetic, S. and Obradovic, Z. (2001) “Classification on data with biased class distribution," Proc. 12th European Conf. on Machine Learning, Freiburg, Germany, pp. 527-538.

146. Vucetic, S. and Obradovic, Z. (2000) “A Regression-Based Approach for Scaling-Up Personalized Recommender Systems in E-Commerce," Web Mining for E-Commerce Workshop at the Sixth ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining, Boston, MA.

147. Vucetic, S. and Obradovic, Z. (2000) “Performance Controlled Data Reduction for Knowledge Discovery in Distributed Databases," Proc. Pacific-Asia Conference on Knowledge Discovery and Data Mining, Kyoto, Japan, April 2000, Computer Science Editorial 3, Springer-Verlag, pp. 29-39.

148. Obradovic, D. and Obradovic Z. (1999) “Efficient Probability Density Balancing for Supporting Distributed Knowledge Discovery in Large Databases," Proc. IEEE/INNS Int'l Joint Conf. on Neural Networks, IEEE Press, ISBN 0-7803-5532-6, Washington D.C., No. 347, Session 8.1B.

Parallel and Distributed Data Mining: Journal Articles

149. Lazarevic, A. and Obradovic, Z. (2002) “Knowledge Discovery in Multiple Spatial Databases," Neural Computing and Applications, vol. 10, no. 4, pp. 339-350.

150. Lazarevic, A. and Obradovic, Z. (2002) “Boosting Algorithms for Parallel and Distributed Learning," Distributed and Parallel Databases: An International Journal, Special Issue on Parallel and Distributed Data Mining, vol. 2, pp. 203-229.

151. Obradovic, Z. and Mehr, I., (1996) “Parallel Neural Network Learning Through Repetitive Bounded Depth Trajectory Branching," Neural, Parallel and Scientific Computations, vol. 4, no. 4, pp. 475-491.

Parallel and Distributed Data Mining: Fully Refereed Conference Articles

152. Lazarevic, A. and Obradovic, Z. (2001) “Data Reduction using Multiple Models Integration," Principles of Knowledge Discovery in Databases, Proc. 5th European Conf., Freiburg, Germany, pp. 301-313.

153. Lazarevic, A. and Obradovic, Z. (2001) “The Distributed Boosting Algorithm," Proc. 7th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining, San Francisco, CA, pp. 311-316.

154. Lazarevic, A. and Obradovic, Z. (2001) “The Effective Pruning of Neural Network Ensembles," Proc. 2001 IEEE/INNS International Joint Conference on Neural Networks, Washington D.C., pp. 796-801.

155. Lazarevic, A. and Obradovic, Z. (2001) “Boosting Localized Classifiers in Heterogeneous Databases," Proc. First SIAM Int'l Conf. on Data Mining, April 5-7, Chicago, USA.

156. Lazarevic, A., Pokrajac, D., and Obradovic, Z. (2000) “Distributed Clustering and Local Regression for Knowledge Discovery in Multiple Spatial Databases," Proc. 8th European Symposium on Artificial Neural Networks, Bruges, Belgium, April 2000, pp. 129-134.

157. Venkateswaran, R. and Obradovic, Z., (1994) “Efficient Learning through Cooperation," Proc. World Congress on Neural Networks, San Diego, CA, vol. 3, pp. 390-395.

158. Mehr, I., and Obradovic, Z., (1994) “Parallel Neural Network Learning Through Repetitive Bounded Depth Trajectory Branching," Proc. 8th IEEE Int. Parallel Processing Symposium, Cancun, Mexico, pp. 784-791.

159. Fletcher, J. and Obradovic, Z., (1993) “Parallel Constructive Neural Network Learning," Proc. 2nd IEEE Int. Symp. on High-Performance Distributed Computing, Spokane, WA, pp. 174-178.

Parallel and Distributed Data Mining: Invited and Non-refereed Articles

160. Lazarevic, A., Pokrajac, D. and Obradovic, Z. (2000) “An E-commerce System for Mining Distributed Spatial Databases," Int'l Conf. on Advances in Infrastructure for Electronic Business, Science, and Education on the Internet, L'Aquila, Italy, August 2000 (by invitation conference).

161. Mehr, I., Obradovic, Z. and Venkateswaran, R. (1994) “Parallel and Distributed Gradient Descent Learning," Notes of the Neural Networks Workshop for the Hanford Community, Pacific Northwest Laboratory, Richland, WA, pp. 31-38.

Time Series Analysis: Journal Articles:

162. Vucetic, S., Obradovic, Z. and Tomsovic, K. (2001) “Price-Load Relationships in California's Electricity Market," IEEE Trans. on Power Systems, Vol. 16, No. 2, pp. 280-286.

163. Obradovic, Z. and Chenoweth, T., (1996) “Selection of Learning Algorithms for Trading Systems Based on Biased Estimators." Heuristics, The Journal of Intelligent Technologies, vol. 9, no. 1, pp. 9-21.

164. Chenoweth, T., Obradovic, Z. and Lee, S. (1996) “Embedding Technical Analysis into Neural Network Based Trading Systems," Applied Artificial Intelligence, Taylor & Francis, Washington D.C., vol. 10, no. 6, pp. 523-541.

165. Drossu, R. and Obradovic, Z., (1996) “Regime Signaling Techniques for Non-stationary Time Series Forecasting." Journal of Computational Intelligence in Finance, Finance & Technology Publishing, vol. 4, no. 5, pp. 7-15.

166. Drossu, R. and Obradovic, Z., (1996) “Rapid Design of Neural Networks for Time Series Prediction," IEEE Computational Science and Engineering, vol. 3, no. 2, pp. 78-89.

167. Chenoweth, T. and Obradovic, Z., (1996) “A Multi-Component Nonlinear Prediction System for the S&P 500 Index," Neurocomputing, vol. 10, no. 3, pp. 275-290.

168. Chenoweth, T. and Obradovic, Z., (1995) “An Explicit Feature Selection Strategy for Predictive Models of the S&P 500 Index," Journal of Computational Intelligence in Finance, Finance & Technology Publishing, vol. 3, no. 6, pp. 14-21.

169. Perez, L.G., Flechsig, A.J., Meador, J.L. and Obradovic, Z. (1994) “Training an Artificial Neural Network to Discriminate Between Magnetizing Inrush and Internal Faults," IEEE Trans. on Power Delivery, vol. 9, no. 1, pp. 434-441.

Time Series Analysis: Peer Reviewed Book Chapters

170. Drossu, R. and Obradovic, Z. (2000) “Data Mining Techniques for Designing Efficient Neural Network Time Series Predictors," peer reviewed book chapter no. 10 in Cloete, I. and Zurada, J. Knowledge-Based Neurocomputing, MIT Press, ISBN 0-262-03274-0, pp. 325-368.

171. Drossu, R. and Obradovic, Z. (1997) “An Analysis of the INFFC Cotton Futures Time Series: Lower Bounds and Testbed Design Recommendations," in Caldwell, B. R., (editor) Nonlinear Financial Forecasting: The First Nonlinear Financial Forecasting Competition, Finance & Technology Publishing, pp. 241-261.

172. Drossu, R. and Obradovic, Z. (1996) “Prediction Horizon Effects on Stochastic Modeling Hints for Neural Networks," In P.E. Keller, S.Hashem, L.J. Kangas, and R. T. Kouzes (editors) Applications of Neural Networks in Environment, Energy, and Health

173. Drossu, R., Lakshman, T.V., Obradovic, Z. and Raghavendra C.S., (1995) “Single and Multiple Frame Video Traffic Prediction Using Neural Network Models," In Raghavan S.V. and Jain B.N. (editors) Computer Networks, Architecture and Applications, Chapman & Hall, 1995, chapter 9, pp. 146-158.

Time Series Analysis: Fully Refereed Conference Articles

174. Vucetic, S. and Obradovic, Z. (2000) “A Constructive Competitive Regression Method for Analysis and Modeling of Non-stationary Time Series," Proc. the First Int'l Workshop on Computational Intelligence in Economics and Finance at the Fifth Int'l Conf. On Information Science, Atlantic City, N.J., USA, vol. 2, pp. 978-981.

175. Drossu, R. and Obradovic, Z., (1997) “INFFC Data Analysis: Lower Bounds and Testbed Design Recommendations," Proc. 1997 Computational Intelligence in Financial Engineering, New York, N.Y., pp. 71-74.

176. Drossu, R. and Obradovic, Z. (1997) “Regime Signaling Techniques for Non-stationary Time Series Forecasting," Proc. Chaotic and Complex Systems Minitrack at the 30th Hawaii Int'l Conf. on System Sciences, IEEE Computer Society Press, vol. 5, pp. 530-538.

177. Drossu, R. and Obradovic, Z., (1995) “Novel Results on Stochastic Modelling Hints for Neural Network Prediction," Proc. World Congress on Neural Networks, Washington, D.C., vol. 3, pp. 230-233.

178. Drossu, R. and Obradovic, Z., (1995) “Stochastic Modeling Hints for Neural Network Prediction," Proc. World Congress on Neural Networks, Washington, D.C., vol. 2, pp. 16-19 and 88-91.

179. Drossu, R. and Obradovic, Z., (1995) “Prediction Horizon Effects on Stochastic Modeling Hints for Neural Networks," Proc. the Workshop on Environmental and Energy Applications of Neural Networks, Pacific Northwest Laboratory, Richland, WA, World Scientific Publishing.

Time Series Analysis: Lightly Refereed Conference Articles

180. Obradovic, Z. and Chenoweth, T. (1996) “Selection of Learning Algorithms for Trading Systems Based on Biased Estimators - An Abstract," Working Notes of the 1996 AAAI Workshop on Integrating Multiple Learned Models for Improving and Scaling Machine Learning Algorithms, held in conjunction with National Conference on Artificial Intelligence AAAI, Portland, OR, pp. 93-94.

181. Obradovic, Z. and Vucetic, S. (1999) “Time Series Method for Forecasting Electricity Market Pricing" in Intelligent Systems in Electricity Market Modeling session, IEEE Power Engineering Society 1999 Summer Meeting, Edmonton, Canada.

III. KNOWLEDGE SYSTEMS:

Hybrid Knowledge Systems: Journal Articles

182. Obradovic, Z. and Srikumar, R. (2001) “Parallelizing Design of Application Tailored Neural Networks," Facta Universitatis, Ser. Mathematics and Informatics, vol. 16, pp. 97-108.

183. Obradovic, Z. and Srikumar, R. (2000) “Constructive Neural Networks Design Using Genetic Optimization," Facta Universitatis, Ser. Mathematics and Informatics, vol. 15, pp. 133-146.

184. Obradovic, Z. (1997) “Guest Editorial: Hybrid Intelligence for Financial Forecasting,” NeuroVest Journal, vol. 5, no. 1, pp. 4-5.

185. Drossu, R., Obradovic, Z. and Fletcher, J. (1996) “A Flexible Graphical User Interface for Embedding Heterogeneous Neural Network Simulators," IEEE Trans. on Education, special issue on Applications of Information Technology, volume 39, no. 3, pp. 367-374.

186. Fletcher, J. and Obradovic, Z., (1995) “A Discrete Approach to Constructive Neural Network Learning," Neural, Parallel and Scientific Computations, vol. 3, no. 3, pp. 307-320.

187. Fletcher, J. and Obradovic, Z. (1993) “Combining Prior Symbolic Knowledge and Constructive Neural Networks," Connection Science: Journal of Neural Computing, Artificial Intelligence and Cognitive Research, vol. 5, nos. 3 & 4, pp. 365-375.

Hybrid Knowledge Systems: Fully Refereed Conference Articles

188. Obradovic, Z. and Chenoweth, T. (1996) “Selection of Learning Algorithms for Trading Systems Based on Biased Estimators," Proc. Adaptive Distributed Parallel Computing Conference, Dayton, OH, pp. 458-467.

189. Romero, P. and Obradovic, Z. (1995) “Comparison of Symbolic and Connectionist Approaches to Local Experts Integration," Proc. the IEEE Technical Applications Conference at Northcon 95, Portland, OR, pp. 105-110.

190. Chenoweth, T., Obradovic, Z., and Lee, S. (1995) “Technical Trading Rules as a Prior Knowledge to a Neural Networks Prediction System for the S&P 500 Index," Proc. The IEEE Technical Applications Conference at Northcon 95, Portland, OR, pp. 111-115.

191. Chenoweth, T. and Obradovic, Z. (1995) “A Multi-Component Approach to Stock Market Prediction," Proc. 3rd Int. Conf. on Artificial Intelligence on Wall Street, New York, N.Y., pp. 74-79.

192. Chenoweth, T. and Obradovic, Z. (1994) “Feature Selection for Predictive Models of the Stock Market," Proc. 2nd Int. Workshop Neural Networks in the Capital Market, Pasadena, CA.

193. Obradovic, Z. and Fletcher, J. (1993) “Integration of Knowledge-Based and Constructive Learning Neural Networks," Proc. 1993 World Congress on Neural Networks, Portland, OR, vol. 1, pp. 589-592.

194. Obradovic, Z. and Fletcher, J. (1992) “Integration of Knowledge-Based and Constructive Learning Neural Networks," Notes of the 1992 AAAI Workshop on Integrating Neural and Symbolic Processes, held in conjunction with National Conference on Artificial Intelligence AAAI, San Jose, CA.

Hybrid Knowledge Systems: Lightly Refereed Conference Articles

195. Fletcher, J. and Obradovic, Z. (1992) “Creation of Neural Networks by Hyperplane Generation from Examples Alone," Notes of the Neural Networks for Learning, Recognition, and Control Research Conference, G. A. Carpenter and S. Grossberg (eds.), the Wang Institute of Boston University.

Hybrid Knowledge Systems: Invited Articles

196. Obradovic, Z. (1997) “Guest Editorial: Hybrid Intelligence for Financial Forecasting," Journal of Computational Intelligence in Finance, vol. 5, no. 1, pp. 4-5.

197. Obradovic, Z. (1998) “Embedding Prior Knowledge to Statistical Learning Systems for Efficient Knowledge Discovery in Large Databases," Symposium on Contemporary Mathematics, Belgrade, Yugoslavia.

198. Fletcher, J. and Obradovic, Z., (1994) “Integrating a Parallel Constructive Neural Network Algorithm with an Expert System," Notes of the Neural Networks Workshop for the Hanford Community, Pacific Northwest Laboratory, Richland, WA, pp. 58-66.

Hybrid Knowledge Systems: Peer Reviewed Book Chapters

199. Romero, P., Obradovic, Z. and Fletcher J. (2000) “Integration of Heterogeneous Sources of Partial Domain Knowledge," peer reviewed book chapter no. 7 in Cloete, I. and Zurada, J. Knowledge-Based Neurocomputing, MIT Press, pp. 217-250.

Neural Networks: Journal Articles

200. Pokrajac, D., Milutinovich, J. and Obradovic, Z. (2005) “Toward Neural Network-Based Profit Optimization," Facta Universitatis, Series Economics and Organization, vol. 2, no. 3, pp. 261-275.

201. Obradovic, Z., (1996) “Computing with Nonmonotone Multivalued Neurons," Multiple Valued Logic, vol. 1, no. 4, pp. 271-294.

202. Obradovic, Z. and Parberry, I. (1994) “Learning with Discrete Multi-Valued Neurons," Journal of Computer and System Sciences, vol. 49, no. 2, pp. 375-390.

203. Obradovic, Z. and Parberry, I. (1992) “Computing with Discrete Multi-Valued Neurons," Journal of Computer and System Sciences, vol. 45, no. 3, pp. 471-492.

204. Obradovic, Z. and Yan, P., (1990) “Small Depth Polynomial Size Neural Networks," Neural Computation, vol. 2, no. 4, pp. 402-404.

Neural Networks: Fully Refereed Conference Articles

205. Jovanovic, N., Milutinovic, V. and Obradovic, Z. (2002) “Foundations of Predictive Data Mining," Proc. IEEE 6th Conf. on Neural Networks Applications in Electrical Engineering, Belgrade, Yugoslavia, pp. 53-58.

206. Pokrajac, D. and Obradovic, Z. (2001) “Neural Network-Based Method for Site-Specific Fertilization Recommendation," Proc. Society for Engineering in Agricultural, Food, and Biological Systems (ASAE) Annual International Meeting, 2001.

207. Ngom, A., Obradovic, Z. and Stojmenovic, I. (1998) “Minimization of Multivalued Multithreshold Perceptrons Using Genetic Algorithms," The 28th IEEE Int'l. Symp. On Multiple-Valued Logic, Fukuoka, Japan, pp. 209-214.

208. Milenkovic, S., Obradovic, Z. and Litovski, V. (1996) “Annealing Based Dynamic Learning in Second-Order Neural Networks," Proc. IEEE Int. Conf. on Neural Networks, Washington D.C., pp. 458-463.

209. Drossu, R., Obradovic, Z. and Fletcher, J. (1996) “A Flexible Graphical User Interface to Heterogeneous Neural Network Simulators," Proc. 10th European Simulation Multiconference, Int. Society for Computer Simulation, Budapest, Hungary, pp. 273-278.

210. Venkateswaran, R., Obradovic, Z., and Raghavendra, C.V. (1996) “Cooperative Genetic Algorithm for Optimization Problems in Distributed Computer Systems," Proc. 2nd Online Workshop on Evolutionary Computation, March 11-22, 1996, pp. 49-52. Also at WWW location http://www.bioele.nuee.nagoya-u.ac.jp/wec2/papers/p015.html.

211. Drossu, R., Lakshman, T.V., Obradovic, Z. and Raghavendra C.S. (1994) “Neural Network Techniques for Video Traffic Prediction," Proc. 6th Int’l. Workshop on Packet Video, Portland, pp. D.9.1-D.9.4.

212. Fletcher, J. and Obradovic, Z. (1994) “Constructively Learning a Near-Minimal Neural Network Architecture," Proc. IEEE Int’l. Conf. on Neural Networks, Orlando, FL, pp. 204-208.

213. Obradovic, Z. and Srikumar, R. (1994) “Evolutionary Design of Application Tailored Neural Networks," Proc. IEEE Int’l. Symp. Evolutionary Computation, Orlando, FL, pp. 284-289.

214. Perez, L.G., Flechsig, A.J., Meador, J.L. and Obradovic, Z. (1993) “Training an Artificial Neural Network to Discriminate Between Magnetizing Inrush and Internal Faults," IEEE Power Engineering Society 1993 Winter Meeting.

215. Obradovic, Z. and Parberry, I. (1990) “Analog Neural Networks of Limited Precision I: Computing with Multilinear Threshold Functions," in Advances in Neural Information Processing Systems 2, ed. D.S. Touretzky, San Mateo, CA: Morgan-Kaufmann, pp. 702-709.

216. Obradovic, Z. and Parberry, I. (1990) “Learning with Discrete Multi-Valued Neurons," Machine Learning: Proc. 7th Int’l. Conf., ed. B. W. Porter and R.J. Mooney, Austin, TX, Morgan-Kaufmann, pp. 392-399.

217. Pokrajac, D. and Obradovic, Z. (2001) “Neural network-based software for precision farming," Proc. 2001 IEEE/INNS International Joint Conference on Neural Networks, Washington D.C.

218. Perez, L.G., Flechsig, A.J., Meador, J.L. and Obradovic, Z. (1993) “Training an Artificial Neural Network to Discriminate Between Magnetizing Inrush and Internal Faults," IEEE Power Engineering Society 1993 Winter Meeting.

Neural Networks: Lightly Refereed Conference Articles

219. Milenkovic, S., Litovski V. and Obradovic Z. (1996) “Nondeterminism in Artificial Neural Networks," Proc. Int’l. Memorial Conference “D.S.Mitrinovic", Nis, Yugoslavia.

220. Milenkovic, S., Litovski, V. and Obradovic, Z., (1993) “A New Adaptive Move Selection in Simulating Annealing," Proc. 15-16 Int. Annual School on Semiconductor and Hybrid Technologies, pp. 22-31, Sozopol, Bulgaria, 13-17 May, 1992-1993.

221. Meador, J. and Obradovic, Z., (1992), “A Connectionist AI Approach to Automatic Test," IEEE Pacific Test Workshop, Whistler, BC, Canada.

222. Obradovic, Z. and Srikumar, R., (1992) “Dynamic Evaluation of a Backup Hypothesis," Notes of the Neural Networks for Learning, Recognition, and Control Research Conference, G. A. Carpenter and S. Grossberg (eds.), the Wang Institute of Boston University.

223. Palmer, D., Obradovic, Z. and Allison, C. (1992) “Determining the Cause for Poor Performance of a Classification Learning System," Notes of the Neural Networks for Learning, Recognition, and Control Research Conference, G. A. Carpenter and S. Grossberg (eds.), the Wang Institute of Boston University.

Other Refereed Journal Articles:

224. Obradovic, Z., Potkonjak, M. and Obradovic, M. (1987) “Design of Efficient Algorithms for VLSI Systolic Arrays," Informatica, vol. 21, pp. 153-159.

Other Refereed Conference Articles:

225. Obradovic, Z. and Obradovic, M. (1989) “Design of a New Parallel Language and Compiler Development," Proc. 11th Int’l Symposium Computer at the University, Cavtat, pp. 3.6.1-3.6.6.

226. Obradovic, Z. and Potkonjak, M. (1987) “Software Speed-up of VLSI Systolics with Idle Cells," Proc. 9th Int’l. Symposium Computer at the University, Cavtat, pp. 2S.01.1-2S.01.4.

227. Obradovic, Z. and Potkonjak, M. (1986) “A New Heuristic Algorithm for Solving Travelling Salesman Problem and Similar Problems," Proc. 8th Int’l. Symposium Computer at the University, Cavtat, vol. I, pp. 37.1-37.7.

228. Protic, V., Mladenovic, B. and Obradovic, Z., (1986) “An Environment for Microcomputer Development, Testing and Installation," Proc. 10-th BIH Symposium on Informatics, Jahorina, pp. 187.1-187.8.

Editorial Articles:

229. Obradovic, Z. and Liu, H. (2009) “Editorial: Special Issue on the Best of SDM’09,” Statistical Analysis and Data Mining, vol. 2, no. 5-6, pp. 291-293.

SOFTWARE SYSTEMS:

Peer Reviewed Software:

• Drossu, R., Obradovic, Z. and Fletcher, J. (1996) “The HDE and the BP Neural Network Simulators with a TCL/TK Based Graphical User Interface," Peer reviewed. Co-sponsors: the IEEE Education Society, the IEEE Foundation, the National Science Foundation Division of Undergraduate Education, the IEEE Educational Activities Board, the IEEE Computer Society, the IEEE Technical Activities Board, the International Engineering Consortium, the American Society of Mechanical Engineers, and the IEEE Engineering Management Society. CD-ROM complementing the IEEE Trans. on Education special issue on Applications of Information Technology, M. Hagler (Ed.), IEEE Press, vol. 39, no. 3, August 1996, CD-ROM directory 18.

Commercial Software:

• Romero, P., Obradovic, Z. and Dunker, A.K. (1999) “Protein Disorder Prediction Software," Washington State University license transferred to Molecular Kinetics Inc.

• Obradovic, Z. and Lazarevic, A. (1999) “A Corporate Data Analysis System for Identify the Attributes that Drive Business Value," in use by Valueminer.com Inc.

PATENT:

• Obradovic, Z., Fiez, T., Vucetic, S., Lazarevic, A., Pokrajac, D. and Hoskinson, R. “Systems and Methods for Knowledge Discovery in Spatial Data," United States Patent No. 6865582 (issued March 8, 2005).

PROFESSIONAL ACTIVITIES:

Executive Editor:

• Statistical Analysis and Data Mining journal, Executive Editor for Applications, 2010 – Present.

Editorial Board Member:

• International Journal of Computational Intelligence in Bioinformatics and Systems Biology, 2009 – Present.

• International Journal of Computational Models and Algorithms in Medicine, 2009 – Present.

• Journal of Biomedicine and Biotechnology, 2008 – Present.

• Advances in Bioinformatics, 2008 – Present.

• Statistical Analysis and Data Mining, 2006 – 2009.

• International Journal of Parallel, Emergent and Distributed Systems, 2006 – Present.

• International Journal of Data Mining and Bioinformatics, 2005 – Present.

• Multiple Valued Logic and Soft Computing, 1995 – Present.

• IEEE Trans. on Education, 1997 – 2001.

• Journal of Computational Intelligence in Finance, 1995 – 1999.

Guest Editor:

• Statistical Analysis and Data Mining, The Best of SDM 2009 Issue (co-edited with H. Liu).

• BMC Bioinformatics, Special Issue on First International Workshop on Text Mining in Bioinformatics (co-edited with M. Song), vol. 8, supp. 9, 2007.

• Knowledge and Information Systems, Special Issue on Selected and Revised Papers from KDD-2000 Workshop on Distributed and Parallel Knowledge Discovery, vol. 3, no. 4, 2001 (co-edited with J. Ghosh, H. Kargupta and V. Kumar).

• Journal of Computational Intelligence in Finance, Special Issue on Financial News Analysis using Distributed Data Mining, vol. 7, no. 2, March 1999, (co-edited with S.H. Rubin).

• Journal of Computational Intelligence in Finance, Special Issue on Hybrid Neural Networks for Financial Forecasting, vol. 5, no. 1, January 1997.

Program Chair:

• The 4th International Workshop on Mining Multiple Information Sources, in conjunction with IEEE International Conference on Data Mining, Sydney, Australia, Dec. 2010 (Co-Chair with R. Jin, X. Zhu, H. Wang).

• Ninth SIAM International Conference on Data Mining, Reno, April 2009 (Program Co-Chair with H. Liu).

• IEEE 2007 International Conference on Bioinformatics and Biomedicine, San Jose, CA, Nov. 2007 (Program Co-Chair with X.T. Hu and I. Mandoiu; and Steering committee member).

• The 39th Symposium on the Interface of Statistics, Computing Science and Applications, Philadelphia, PA, May 2007 (Co-Chair with A. Izenman).

• ACM First International Workshop on Text Mining in Bioinformatics, Arlington, MD, Nov. 2006 (Co-Chair with M. Song).

• Distributed and Parallel Knowledge Discovery Workshop, The Sixth ACM SIGKDD Int'l. Conf. on Knowledge Discovery and Data Mining, Boston, August, 2000 (Co-chair).

Track Chair:

• The 17th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, San Diego, California, Aug. 2011 (Senior Program Committee Member).

• Eleventh SIAM International Conference on Data Mining, Phoenix, Arizona, April 2011 (Senior Program Committee Member).

• Tenth SIAM International Conference on Data Mining, Columbus, Ohio, April 2010 (Senior Program Committee Member).

• Eighth SIAM International Conference on Data Mining, Atlanta, Georgia, April 2008 (Applications Track Chair).

• The 2007 International Conference on Artificial Intelligence, Las Vegas, NV, June 2007 (Program Vice Chair).

• The 2007 International Conference on Bioinformatics and Computational Biology, Las Vegas, NV, June 2007 (Program Vice Chair).

• The 2007 International Conference on Genetic and Evolutionary Methods, Las Vegas, NV, June 2007 (Program Vice Chair).

• The 2007 International Conference on Scientific Computing, Las Vegas, NV, June 2007 (Program Vice Chair).

• IEEE 21st International Conference on Advanced Information Networking and Applications, Niagara Falls, Canada, May 2007 (Program Vice Chair for Distributed Database and Data Mining).

• Sixth SIAM International Conference on Data Mining, Bethesda, MD, April 2006 (Bio-Medical Informatics Track Chair).

Steering Committee Member:

• IEEE 2009 International Conference on Bioinformatics and Biomedicine, Washington, D.C., November, 2009.

• 2010 Conference on Intelligent Data Understanding, NASA Ames Research Center.


Program Committee Member:

• Second ICDM Workshop on Knowledge Discovery from Climate Data: Prediction, Extremes, and Impact, held in conjunction with The IEEE International Conference on Data Mining (IEEE ICDM), Sydney, Australia, December, 2010.

• The 2010 ACM Second International Workshop on Data and Text Mining in Bioinformatics, in conjunction with CIKM 2010.

• The Third Conference on Intelligent Data Understanding, San Francisco Bay area, October 5-7th, 2010.

• The 9th International Workshop on Data Mining in Bioinformatics, held in conjunction with The ACM 16th SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, Aug. 2010.

• The 2010 International Conference on Advances in Social Networks Analysis and Mining, Odense, Denmark, August, 2010.

• The 8th International Conference on Machine Learning and Application (ICMLA 2009), Miami, Florida, USA, December, 2009.

• First ICDM Workshop on Knowledge Discovery from Climate Data: Prediction, Extremes, and Impact, held in conjunction with The IEEE International Conference on Data Mining (IEEE ICDM), Miami, Florida, USA, December, 2009.

• The 15th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Paris, France, June 2009.

• The 7th International Conference on Machine Learning and Application (ICMLA 2008), San Diego, California, Dec., 2008.

• The 2008 IEEE International Conference on Bioinformatics and Biomedicine, Philadelphia, PA, Nov 7-9, 2008.

• The 2008 ACM Second International Workshop on Data and Text Mining in Bioinformatics, in conjunction with CIKM 2008, Napa Valley, California, Oct., 2008.

• 12th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Osaka, Japan, May, 2008.

• Data Mining in Medicine Workshop, in conjunction with the IEEE International Conference on Data Mining, Omaha, Nebraska, October, 2007.

• 2nd IAPR Workshop on Pattern Recognition in Bioinformatics, Singapore, October, 2007.

• 11th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Nanjing, China, May, 2007.

• 2nd BioDM Workshop on Data Mining for Biomedical Applications, Nanjing, China, May, 2007.

• Seventh SIAM International Conference on Data Mining, Minneapolis, Minnesota, April, 2007.

• 2007 IEEE Symposium Series on Computational Intelligence, Data Mining Symposium, Honolulu, Hawaii, April 2007.

• 6th International Workshop on Data Mining in Bioinformatics, 12th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining, Philadelphia, Aug. 2006.

• 2006 IAPR Workshop Pattern Recognition in Bioinformatics, Hong Kong, Aug. 2006

• IEEE International Conference on Mechatronics and Automation, Luoyang, Henan, China, June 2006.

• The 38th Symposium on the Interface of Statistics, Computing Science and Applications, Pasadena, CA, May 2006.

• 9th Workshop on Mining Scientific Datasets, Bethesda, Maryland, April 2006.

• 5th IEEE Symposium on Bioinformatics and Bioengineering, Minneapolis, Minnesota, Oct. 2005.

• IEEE Region 8 EUROCON Int’l Conf. on Computer as a Tool, Belgrade, Serbia, Nov. 2005.

• 4th Int’l Conf. Computational Intelligence in Economics and Finance, special session on Forecasting Volatility in Financial Market, Salt Lake City, Utah, July 2005.

• 2005 IEEE Int’l Conf. Mechatronics and Automation, Niagara Falls, Canada, July 2005.

• Fifth SIAM International Conference on Data Mining, Newport Beach, CA, April 2005.

• Emerging Information Technology Conference, Princeton University, Oct. 2004.

• Fourth SIAM International Conference on Data Mining, Orlando, FL, April 2004.

• Bioinformatics Workshop at the Fourth SIAM International Conference on Data Mining, Orlando, FL, April 2004.

• 3rd Workshop on Bioinformatics in Data Mining (BIOKDD 2003), ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, August 2003.


• Third SIAM International Conference on Data Mining, San Francisco, CA, May 2003.

• Sixth Workshop on Mining Scientific Dataset, San Francisco, CA, May 2003.

• Second Workshop on Data Mining in Bioinformatics, The Eighth ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining, Edmonton, Alberta, Canada, July 2002.

• Int'l Conf. on Neural Networks Applications in Electrical Engineering, Belgrade, Serbia, September 2002.

• Soft Computing in Financial Markets Conference, Int'l Congress on Computational Intelligence Methods and Applications, Rochester Institute of Technology, N.Y., June 1999.

• Distributed and Parallel Data Mining Workshop, Knowledge Discovery in Databases Conference, New York City, N.Y., August 1998.

• 4th Int'l Conf. Neural Networks Applications in Electrical Engineering, Belgrade, Serbia, September 1997.

Executive Committee Member:

• Greater Philadelphia Bioinformatics Alliance (BioAdvance, The Children’s Hospital of Philadelphia, Drexel University, Fox Chase Cancer Center, Penn State, Temple University, Thomas Jefferson University, University of Pennsylvania, University of the Sciences in Philadelphia, The Wistar Institute), 2002 – Present.

Advisory Board Member:

• The Bioinformatics and Medical Informatics Graduate Program and its associated research center, San Diego State University, 2008 – Present.

• International Artificial Intelligence Knowledge Society, 2005 – Present.

Advising Expert:

• Bioinformatics Faculty Recruitment Committee, Faculty of Science and Technology, Uppsala University, Sweden, 2003.

Grant Proposal Review Panel Member:

• The National Science Foundation, Directorate for Computer and Information Science and Engineering, Division of Information and Intelligent Systems, 1996, 1998, 1999, 2003, 2004, 2008.

• The First Int'l Nonlinear Financial Forecasting Competition, Performance Analyst Evaluating Prediction Strategy Entries, 1996.

Keynote Lectures:

• “Spatio-Temporal Characterization of Aerosols through Active Use of Data from Multiple Sensors,” Keynote Lecture at the 3rd International Workshop on Mining Multiple Information Sources, in conjunction with IEEE International Conference on Data Mining, Miami, FL, Dec. 2009.

• “Knowledge Discovery from Biological Databases for Understanding Protein Disorder,” Keynote Lecture at the 2008 IEEE International Conference on Bioinformatics and Biomedicine, Philadelphia, PA, Nov., 2008.

• “Functions of Intrinsically Disordered Proteins and Relationship with Human Disease Network ,” Keynote Lecture at 12th Serbian Mathematics Congress, Novi Sad, August, 2008.

• “Data Mining Support for Retrieval and Analysis of Geophysical Parameters,” Plenary Lecture at the 10th IASTED International Conference on Intelligent Systems and Control, Cambridge, MA, November 2007.

Other Invited Lectures:

• “Structured Regression by Continuous Conditional Random Fields and Multiple Noisy Oracles,” Serbian Academy of Sciences and Arts, Belgrade, August 2010.

• “Analysis of Temporal Social Networks and Approximation of the Markov Blanket in a kernel induced space,” Dept. of Organizational Sciences, Univ. of Belgrade, Serbia, June 2010.

• “Unfoldomics of Human Genetic Diseases,” Greater Philadelphia Bioinformatics Alliance Annual Meeting, Drexel University, PA, November 2009.


• “Unfoldomics of Human Genetic Diseases,” workshop on Translational Bioinformatics: Bridging Bioinformatics and Biomedical Informatics in Translational Medicine, Conference on Innovations in Lifesciences and Healthcare, Bryn Mawr University, PA, October, 2009.

• “Computation Enabling Information Sciences: A Data Miner’s Perspective,” panel on Computation Enabling Information Sciences, Computational Engineering and Science/HPC workshop, Lehigh University, PA, October 2009.

• “Uncertainty Estimation and Selection of Sensor Sites in Remote Sensing Applications,” IEEE SCG Section and Dept. of Electrical Engineering at University of Belgrade, September, 2009.

• “Sequence Alignment and Structural Disorder: A Substitution Matrix for an Extended Alphabet,” Serbian Academy of Sciences and Arts, Belgrade, July 2009.

• “Dynamic Clustering-Based Estimation of Missing Values in Mixed Type Data,” Dept. of Organizational Sciences, Univ. of Belgrade, Serbia, June 2009.

• “Functions of Intrinsically Disordered Proteins and Relationship with Human Disease Network,” University of Minnesota, Sept. 2008.

• “Uncertainty Reduction in Gene Expression Data Analysis,” IEEE SCG Section and Dept. of Electrical Engineering at University of Belgrade, August, 2008.

• “Using Prior Knowledge to Reduce Uncertainty when Mining Microarray Data,” IBC's Chips to Hits/Discovery to Diagnostics Conference, Philadelphia, PA, September 2007.

• “Data Mining Support for Aerosol Optical Depth Retrieval and Analysis,” IEEE Section of Serbia and Monte Negro, Dept. of Electrical Engineering, Univ. of Belgrade, Serbia, Aug. 2007.

• “Data Mining Approach to Functional Characterization of Protein Disorder,” Serbian Academy of Sciences and Arts, Belgrade, Aug. 2007.

• “Data Mining Support for Retrieval and Analysis of Geophysical Parameters,” Computer Science Department, Drexel University, Feb. 2007.

• “Data Mining Approach to Functional Annotation of Protein Disorder,” College of Information Science and Technology, Drexel University, Nov. 2006.

• “Data Mining Support for Aerosol Retrieval and Analysis,” 1st Workshop on the Assessment of Global Aerosol Product, Univ. Maryland, Sept. 2006.

• “Using Gene Ontology Graphs for Biomarkers Selection from Integrated Microarray, Proteomics and Clinical Data,” International Mathematical Conference - Topics in Mathematical Analysis and Graph Theory Conference, Belgrade, Serbia, Sept. 2006 (a satellite to International Congress of Mathematicians, Madrid, Aug. 2006).

• “Integration of Deterministic and Statistical Algorithms for Aerosol Retrieval,” 38th Symposium on the Interface of Statistics, Computing Science and Applications, Pasadena, CA, May 2006.

• “A Toolbox for Characterization of Gene Functional Expression Profiles,” Keynote lecture at Indiana Bioinformatics Conference, School of Medicine, University of Indiana, Indianapolis, May 2006.

• “Earth Science Applications of Data Mining,” Mathematics and Computer Science Dept., Saint Joseph’s Univ., March 2006.

• “Integration of Deterministic and Statistical Algorithms for Aerosol Retrieval,” Serbian Academy of Sciences and Arts, Belgrade, Sept. 2005.

• “A Toolbox for Characterization of Gene Functional Expression Profiles,” IEEE Section of Serbia and Monte Negro, Dept. of Electrical Engineering, Univ. of Belgrade, Serbia, Sept. 2005.

• “Characterization of Gene Functional Expression Profiles,” School of Public Health, Univ. of Medicine and Dentistry New Jersey and American Statistical Association, New Jersey Chapter, April 2005.

• “Data Mining Approach to Study of Protein Disorder,” Center for Advanced Biotechnology and Medicine, Univ. of Medicine and Dentistry New Jersey, April 2005.

• “Data Fusion and Models Fusion for Efficient and Accurate Aerosol Retrieval,” Jet Propulsion Laboratories and Caltech University, Pasadena, CA, March, 2005.

• “Data Mining for Efficient and Accurate Large Scale Retrieval of Geophysical Parameters,” American Geophysical Union Fall Meeting, San Francisco, CA, Dec. 2004.

• “Data Mining Approach to Study of Protein Disorder,” Emerging Information Technology Conference, Princeton University, Oct. 2004.


• “Characterization of Gene Functional Expression Profiles of Plasmodium Falciparum,” Greater Philadelphia Bioinformatics Alliance Retreat, Oct. 2004.

• “Learning from Large Data Streams," IEEE Section of Serbia and Monte Negro, Dept. of Electrical Engineering, Univ. of Belgrade, Serbia, Sept. 2004.

• “Understanding Functions of Disordered Proteins through Data Mining,” School of Medicine, University of Belgrade, June 2004.

• “Bioinformatics Approach to Study of Protein Disorder,” Georgetown University, May 2004.

• “Predicting Intrinsic Disorder from Amino Acid Sequence,” Harvard University, Aug. 2003.

• “Exploiting Unlabeled Data for Improving Accuracy of Predictive Data Mining,” Rockefeller University, June 2003.

• “Exploring Bias in the Protein Data Bank Using Contrast Classifiers,” Uppsala University, Sweden, May 2003.

• “A Distribution Based System for Discovering Interesting Knowledge in Scientific Databases,” Inauguration Workshop for the Linnaeus Centre for Bioinformatics, Uppsala University, Sweden, November 6, 2002.

• “Protein Disorder Prediction and Function Analysis," Center for Bioinformatics, University of Pennsylvania, Oct. 18, 2002.

• “Towards Solutions to Some Challenging Open Problems in Scientific Data Mining," The Mathematical Institute, Serbian Academy of Sciences and Arts, Belgrade, Serbia, Sept. 04, 2003.

• “Efficient Mining at Large Spatial Databases," IEEE Section of Yugoslavia, Dept. of Electrical Engineering, Univ. of Belgrade, Serbia, June 24, 2003.

• “Analysis of Deregulated Electricity Markets for Trading Optimizations," Plenary lecture, Balkan Power Conference 2002, Belgrade, Serbia, June 19, 2003.

• “Understanding Protein Disorder and Their Flavors," Bioinformatics and Genome Research 2002, Beyond Genome, Information and Ideas for the Post-genomic Era, San Diego, CA, June 4, 2002.

• “Controllable Data Reduction for Efficient Data Analysis of Spatial Databases," Fifth Workshop on Mining Scientific Datasets, Second SIAM Int'l Conf. on Data Mining, Arlington, VA, April 13, 2002.

• “Supervised Clustering of Disordered Proteins," Bioinformatics 2002, April 7, 2002 Bergen, Norway, April 17, 2002.

• “Data Reduction for Spatial Data Analysis,” Computer Engineering Dept., University of South California (January 18, 2002).

• “Knowledge Discovery in Spatial and Temporal Databases," Mathematical Challenges in Scientific Data Mining, NSF Institute for Pure and Applied Mathematics, University of California Los Angeles (January 18, 2002).

• “Commonness, Complexity, Flavors and Function of Intrinsic Protein Disorder: A Bioinformatics Study," Mathematical Challenges in Scientific Data Mining, NSF Institute for Pure and Applied Mathematics, University of California Los Angeles (January 17, 2002).

Grant Proposal Reviewer:

• The National Science Foundation, Advanced Computational Research Program.

• The National Science Foundation, Knowledge Models and Cognitive Systems Program.

• The National Science Foundation, Computational Biological Activities Program.

• The National Science Foundation, Computer and Information Science and Engineering Minority Career Advancement Awards.

• The US Department of Energy, SC-32.

• The US Army Research Office, Life Science Division.

• Science Foundation Ireland, Information and Communications Technology Directorate

UNIVERSITY SERVICE:

• University Research and Creativity Award Committee member (2010, 2011)

• The Sciences Subcommittee member of the University Graduate Board (2010)

• President’s Tenure and Promotion Advisory Committee member (2004 – 2005).


• CST Dean Search Committee member (2004-2005).

• CST College Promotion and Tenure Committee member (2003-2011).

• CST Merit Committee member (2001, 2003, 2006, 2007, 2008, 2010).

• CIS Executive Committee (2009, 2010)

• CIS Chair of Faculty Search Committee (2001, 2002, 2007, 2008).

• CIS Faculty Search Committee member (2009).

• CIS Chair of Promotion and Tenure Committee (2006, 2008, 2009, 2010)

• CIS Promotion and Tenure Committee member (2004).

• CIS Advisor to Faculty Search Committee (2003).

• CIS Chair of Research Committee (2005 – 2007).

• CIS Research Committee member (2000-2005).

• CIS Graduate Studies Committee member (2000-present).

• EECS Graduate Program Coordinator for Computer Science (1997-2000).

• EECS Undergraduate Program Coordinator for Computer Science (1992-1994).

• EECS Personnel and Policy Committee elected member (1997-present).

• EECS Task Force on Graduate Studies Policies member (1996-present).

• CS Faculty Search committee chair (1999).

• EECS Director Search committee member (1999).

• CS Promotion and Tenure Committee chair (1999).

• CS Third Year Evaluation Committee member (1999).

• CS Faculty Search Committee member (1997).

• 20 PhD and more than 30 M.S. thesis committees member (1991-present).

• Interdisciplinary graduate students advising and committee service in Economics, Management and Systems, Biochemistry, Chemistry, Crop and Soil Sciences, Psychology, Education, Electrical Engineering and Computer Science (1991-present).

• CS colloquium coordination (1991-1996).

• CS Graduate Students Admission committee member (1994-1996).

• Software Engineering technical committee member (1994-1996).

• Computer Engineering Curriculum Development committee member (1991-1994).

• Software Engineering Curriculum Development committee member (1991-1993).

• Algorithmics Curriculum Development committee member (1991-1994).

• Artificial Intelligence Curriculum Development committee member (1991-1994).

• Undergraduate Studies committee member (1991-1994).

• Boeing Chairman Professorship Faculty Search committee member (1991).

TEACHING: Consistently receiving excellent student evaluations (significantly better than for department, college and university at all criteria)

• Machine Learning (graduate course, taught in 2005, 2006)

• Knowledge Discovery and Data Mining (graduate course, taught in 1998, 2003, 2004, 2008, 2010).

• Data Warehousing, Filtering and Data Mining (graduate course, taught in 2001).

• Neural Computation (graduate course, developed and taught in 1992, 1993, 1994, 1996 and 1999, 2001, 2006, 2009, 2011).

• Parallel Computation (new graduate course, developed and taught in 1992, 1993 and 1996).

• Artificial Intelligence (graduate course, taught in 1992 and 1993).

• Algorithmics (graduate course, developed and taught in 1995, 1996, 1997, 1998).

• Design and Analysis of Algorithms (undergraduate course, taught in 1994).

• Automata and Formal Languages (undergraduate course, taught in 1994 and 1997).

• Introduction to Artificial Intelligence (undergraduate course, taught in 1991 and 1995).


• Neural Network Design and Application (new undergraduate course, developed and taught in 1997).

STUDENTS ADVISING:

Postdoctoral Associates and Visiting Scholars: Research: Bioinformatics

• Xiaohong Li (1999-2001)

• Slobodan Vucetic (2001-2002)

• Junping Wang (1999-2001)

• Lining Yu (2006-present)

• Vadim Ayuyev (2007-2008)

• Zhongmei Shu (2008-present)

• Dairong Wang (2009-present)

Current Ph.D. Students:

• Debashis Das - Research: Spatial and Temporal Data Mining

• Mohamed Ghalwash - Research: Bioinformatics

• Solomon Jones - Research: Health Informatics

• Joseph Jupin - Research: Data Fusion

• Qiang Lou - Research: Spatial and Temporal Data Mining

• George Mathew - Research: Health Informatics

• Uros Midic - Research: Bioinformatics

• Zhang Ping - Research: Bioinformatics

• Yilian Qin - Research: Spatial and Temporal Data Mining

• Vladimir Ouzienko - Research: Analysis of Social Science Data

• Dusan Ramljak - Research: Bioinformatics

• Vladan Radosavljevic - Research: Spatial and Temporal Data Mining

• Kosta Ristovski - Research: Spatial and Temporal Data Mining

• Alexey Uversky - Research: Medical Informatics

Graduated Ph.D. Students:

• Qifang Xu - Dissertation: “Statistical Analysis of Biological Interactions of Homologous Proteins,” Computer and Information Science Ph.D., Temple University, Fall 2008.

- First Ph.D. position: Research Associate, Fox Chase Cancer Institute, Philadelphia

• Michael Hongbo Xie - Dissertation: “Functional Characterization of Large Scale Biological Data,” Computer and Information Science Ph.D., Temple University, Summer 2007.


- First Ph.D. position: Bioinformatics Specialist, IDF Research Department of Children's Hospital of Philadelphia

• Bo Han - Dissertation: “Knowledge Discovery by Fusion of Information,” Computer and Information Science Ph.D., Temple University, Summer 2007.

- First Ph.D. position:

• Kang Peng - Dissertation: “Learning from Protein Structure Related Data,” Computer and Information Science Ph.D., Temple University, Spring 2006. - First Ph.D. position: Research Associate, School of Informatics, Indiana University, Bloomington.

• Predrag Radivojac - Dissertation: “Classification and Knowledge Discovery in Protein Databases," Computer and Information Science Ph.D., Temple University, Fall 2003. - First Ph.D. position: Research Associate, School of Medicine, Indiana University, Indianapolis.

• Dragoljub Pokrajac - Dissertation: “Knowledge Discovery in Spatial-Temporal Databases," Computer and Information Science Ph.D., Temple University, Summer 2002. - First Ph.D. position: Assistant Professor, Computer Science Dept., Delaware State University.

• Aleksandar Lazarevic - Dissertation: “Distributed Inductive Learning for Time/Space Data Analysis," Computer and Information Science Ph.D., Temple University, Fall 2001. - First Ph.D. position: Research Associate, Army High Performance Computing Research Center, Computer Science Dept., University of Minnesota.

• Slobodan Vucetic - Dissertation: “On-line Systems for Non-stationary Data Analysis and Modeling," Electrical Engineering Ph.D., WSU, Summer 2001. - First Ph.D. position: Visiting Assistant Professor, Center for Information Sciences and Technology and Computer and Information Sciences Department, Temple University, Philadelphia, PA.

• Pedro Romero - Dissertation: “Knowledge Discovery and Data Mining in Protein Databases," Computer Science Ph.D., WSU, Spring 1999. - First Ph.D. position: Research Scientist, Artificial Intelligence Laboratories, Stanford Research Institute International, Menlo Park, CA.

• Radu Drossu - Dissertation: “Efficient Design of Neural Networks for Time Series Prediction," Computer Science Ph.D., WSU, Summer 1997. - First Ph.D. position: Staff Scientist, Financial Engineering Group, HNC Software Inc., San Diego, CA.

• Tim Chenoweth - Dissertation: “A Neural Network Based System for Predicting Future Returns for the S&P 500 Stock Index," Interdisciplinary Ph.D., WSU, Summer 1996. - First Ph.D. position: Assistant Professor (tenure track), School of Accountancy, Arizona State University.

• Srdjan Milenkovic - Dissertation: “Higher-Order Dynamic Learning Through Nondeterministic Global Optimization," Electrical Eng. Ph.D., University of Nis, Yugoslavia, co-advised with Prof. Vanco Litovski from Univ. of Nis, Summer 1996. - First Ph.D. position: Research Scientist, Microelectronics Centre, Middlesex University, London, United Kingdom.

• Justin Fletcher - Dissertation: “A Constructive Approach to Hybrid Architectures for Machine Learning," Computer Science Ph.D., WSU, Summer 1994. - First Ph.D. position: Principal Engineer, Itron Corp., Spokane, WA.


Graduated M.S. Students:

• Stephen Muchmore - Thesis: “Combining Point Level and Aggregated Tract Level Data to Improve Clustering of Adolescent Crime Data in Philadelphia,” Computer Science, M.S., Fall 2007.

• Yilian Qin - Thesis: “Support Vector Machine Reuse for Large Spatio-Temporal Datasets," Computer Science M.S., Spring 2005. - Continued towards a computer science Ph.D. at Temple University

• Tim Chenoweth - Project: “Learning Algorithms for Trading Systems Based on Biased Estimators," Computer Science M.S., Spring 1996. - Continued towards an Interdisciplinary Ph.D. at WSU.

• Srikumar Rangarajan - Thesis: “Design of Application-Tailored Neural Networks Using Genetic Algorithms," Computer Science M.S., Summer 1993. - First M.S. job at the Microsoft Inc., Redmond, WA.

• Anthony Kampka - Thesis: “A Stochastic Technique in Constructive Training of Artificial Neural Networks," Computer Science M.S. Fall 1992. - First M.S. job at the Exabyte Corp., Boulder, CO.

• Shailesh Vaishnavi - Project: “Storage Organization for Multiattribute Retrieval in CAD Databases," Computer Science M.S., Summer 1992. - First M.S. job at the Amdahl Corp., Sunnyvale, CA.

Undergraduate Students Advising:

• Bobby Parchuri - Penn State U. student trained in bioinformatics research in my laboratory (2006)

• Mathew Fenty - Trained in bioinformatics research through a hands-on project in my laboratory (2005)

• Josh Crean and Josh Hartwel - Research assistants on my spatial-temporal data analysis project (2004).

• Timothy O'Connor - Washington State U. student trained on my bioinformatics project with A.K. Dunker (2001-2003). - Visiting scholar at my lab at Temple U. (Goldwater fellowship, 2003). - Accepted to graduate schools at Harvard, Univ. Washington and Princeton. Started Ph.D. studies at Princeton U., Fall 2005.

• Ethan Garner - Research assistant on my project with A.K. Dunker (1997-2000). - Published 4 joint papers on our bioinformatics project. - Accepted to graduate school at Harvard, Stanford, Scripps, Wisconsin, UC Berkeley and UC San Francisco. Starting at UC San Francisco Fall 1999. - Elected to remain at our lab as a technician until Fall 2000 in order to publish several more papers.

• Radmila Sarac - Awarded a Howard Hughes fellowship to work on my bioinformatics research project (1997-1998).

• James Jungbauer - Trained and partially funded in my lab (1994).

• Chris Allison and David Palmer - Research assistants on my neural networks project (1992-1993). - Published a joint conference paper with me.

• Certified Undergraduate CS Program Advising (1992 -1994) (Advising all certified CS students, on average 53 students per year).


• Uncertified CS Students Academic Advising (1991 - 1997) (Advising on average 15 undergraduate students per year).

Other Research Service:

• Mentored twelve high school students on research projects in my lab (1994, 1996).

• Mentored four high school teachers on summer research projects in my lab (1995, 1997).

• Served as External evaluator/committee member for several Ph.D. theses in Asia and Europe (Univ. of Ottawa, National Univ. of Singapore, Univ. of Belgrade, Univ. of Novi Sad, Univ. of Nis).

AWARDS/RECOGNITION:

• H-index 39 according to Harzing's Publish or Perish (as of Aug. 2010)

• Cited more than 6,200 times according to Harzing's Publish or Perish (as of Aug. 2010)

• Author of the 3rd most cited article of all time across all volumes published by the Biochemistry journal (as of Aug. 2010). (The top 20 list is available as "Most Cited Papers, All Time" at the Biochemistry web site, with more details at CrossRef's Linking service.) The Biochemistry journal has an impact factor above 5 and has existed for more than 100 years, currently printing 8 volumes per year with 24 issues per volume.

• Temple University Faculty Research Award, April 2009.

• College of Science and Technology Faculty Research Excellence Award, Nov. 2008.

• Team leader for the best rated model of intrinsically disordered protein regions at the seventh critical assessments of structure prediction experiments (CASP 7), Nov. 2006.

• Team leader for the best predictor in protein disorder category at the sixth critical assessments of structure prediction experiments (CASP 6), Nov. 2004.

• Team leader for the best predictor in protein disorder category at the fifth critical assessments of structure prediction experiments (CASP 5), Nov. 2002.

• Researcher of the Year, College of Engineering and Architecture, Annual Convocation, Washington State University, April 2000.

PERSONAL: USA citizen.


Supplement 2


Resume/Bio Sketch

David R. Schwartz, MSW

CEO/President, Q-linx, Inc.

EDUCATION/TRAINING

The University of Michigan, Ann Arbor, BA, 1994, Kinesiology

The University of Pennsylvania, MSW, 1997, Social Work Macro Practice

Summary

Facilitates meetings with IBM sales teams, marketing teams, technical teams and their customers in the U.S. and globally for technology training, database development/analysis consulting, and sales purposes. Works collaboratively with IBM teams before, during, and after sales meetings, requiring frequent public speaking, presentations, meeting facilitation, executive level customer interaction, training needs assessments, and training.

Recognized as the first to train and test neural network technology (a powerful data mining and pattern recognition technique) utilizing a nationally representative dataset with the purpose of augmenting decision-making and training in child welfare/human services organizations. With extensive multi-disciplinary collaboration, the novel technology (now sold by IBM to government agencies globally) has reached approximately 90% accuracy.

Dedicated to developing and supporting creative evidence-based technology solutions.

Extensive experience initiating and managing multi-disciplinary human services technology development projects collaborating with professors from the University of Pennsylvania’s School of Engineering and Applied Sciences and School of Social Policy and Practice, several Temple University schools and colleges, the University of Michigan, and professional sales and consulting teams at IBM.

Founder and CEO/President, 1998-Present, Q-linx, Inc.

Responsible for management of risk assessment/decision-support technology development and consulting company to augment training and decision-making in the fields of education, health care, criminal justice, and social welfare.

Created risk assessment technology using computational intelligence techniques to aid worker decision-making and augment training.

Leads and manages several multi-disciplinary projects requiring sales/training of customers in addition to managing teams of computer science, engineering, and social science experts. One example is the data mining project for New York’s Office of Children and Family Services with IBM/Q-linx.

Developed Q-linx, Inc.’s global partnership with IBM Global Social Services (1998-Present).

Trains IBM sales teams in the U.S. and internationally on Q-linx risk assessment technology for sales support purposes.

Presents risk assessment technology alongside IBM global and local sales teams to customers in the U.S., Canada, Australia, Japan, Israel, etc., facilitating question and answer sessions with their customers.

Oversees the development of grants and innovative technology-focused services in the child welfare, education, and healthcare fields. For example, presented innovative neural network training/risk assessment portion of a large U.S. Department of Education training grant at the national transition to teaching conference.

Consults on risk assessment/training-focused software solutions for for-profit, nonprofit and government organizations nationally and internationally.

Director of Development, 1997-1998, Child, Inc., Wilmington, DE

Overall responsibility for planning and development activities for a comprehensive child and family services agency.

Co-wrote a winning proposal for Violence Against Women Act funding. A shelter was constructed for victims of domestic violence, with on-site support, therapeutic, and training services focused on the prevention of future domestic violence.

Case Manager, 1994-1995, The Choice Program, Baltimore, MD

Part of a team managing after-school training programs for at-risk middle school students.

Worked in the Prince George’s County, Maryland office in a school-based juvenile justice prevention and alternative to secure detention program.

Violent and nonviolent youth offenders were admitted to the program. Made daily school and home visits, facilitated group sessions with clients, and advocated for the youth in the community.

Awards

Wharton Business School Journal Award, QLINX.com, Most Socially Responsible Business Plan, 2000.

Research Publications

Schwartz, D. R., Kaufman, A. B., & Schwartz, I. M. (2004). Computational intelligence techniques for risk assessment and decision support. Children and Youth Services Review, 26, 1081-1095

Jones, P. R., Schwartz, D. R., Schwartz, I. M., Obradovic, Z., & Jupin, J. (2006). Risk classification and juvenile dispositions: What is the state of the art? Temple Law Review, 79, 2, 461-498

Schwartz, I. M., Jones, P. R., & Schwartz, D. R. Improving Social Work Through the Use of Technology and Research, Child Welfare Research, edited by Duncan Lindsey and Aron Shlonsky. (2008). Oxford University Press: New York.

Selected Presentations

Numerous presentations and papers on augmenting training, risk assessment, and decision-making with technology have been delivered:

Facilitated numerous IBM master class presentations, sales team briefings/training, and customer education/training/program presentations at several APHSA IT Solutions Management for Human Services Conferences.

Risk Assessment Models and Empirical Validity: Making Life and Death Decisions, Panel Discussion, Conference Faculty/Panel Presenter. One Child Many Hands Conference, a Multi-Disciplinary Conference on Child Welfare, University of Pennsylvania. (2009).

Risk Assessment in Juvenile Justice: Identifying Best Practice. Upcoming presentation (2/24/2010) at the annual meeting of the American Society of Criminology (with Peter R. Jones and Ira M. Schwartz).

Risk Classification in Juvenile Justice and Child Welfare: The Dangers of Overconfidence. Paper presented at the annual meeting of the American Society of Criminology (with Peter R. Jones and Ira M. Schwartz). (2008).

The International Society for the Prevention of Child Abuse and Neglect, as accepted presenter and invited session chair/facilitator (Berlin, Germany; Warsaw, Poland; Lisbon, Portugal).

Temple University Medical School, Grand Rounds. National Symposium on Child Sexual Abuse.

Competencies

Data mining with large and small databases, data warehouse development/restructuring, artificial neural networks, fuzzy logic, pattern recognition/risk assessment with large social, health, human services, and criminal datasets, client meeting facilitation, consulting with executive clients, client sales presentations, client/sales team training, frontline worker MIS training and needs assessment, presentation development, project management, E-discovery technologies, multi-disciplinary team management, IT and business expert collaboration.

Contact Information

David R. Schwartz 128 Union Avenue Bala Cynwyd, PA 19004

[email protected] Phone: 610.733.7140

Supplement 3

Type of Document | Name of Document | Bates Range, if applicable
Data, mdb files, and Databases | 684v2000.mdb | n/a
 | Database.mdb | n/a
 | Resource.mdb | n/a
 | Staff.mdb | n/a
 | YI678 Database (PR800I.dbf; PRdata.dbf; PRhoh.dbf; RscChar.dbf; RscHHMmd.dbf; Rscrdata.dbf; rsrcbkck.dbf; rscrtrng.dbf) | n/a
 | YI684 Database (monitor.accdb; monitor.dbf; YI602.dbf; yi684bl2.dbf; YI684CL.dbf; YI684DL.dbf; YI684EL.dbf; yi684pl.dbf; YI701MBL.dbf) | n/a
 | YI701 Database (monitor.dbf; YI701MBL.dbf; yi701mcl.dbf; yi701mdl.dbf; yi701mel.dbf) | n/a
 | YI684 Data from 3/30/2010 Data Run | n/a
 | WebFOCUS Source Code for Access | CodeforAccess&KAHDS-00001-381
 | Gelona’s List of Resource Type Short Names | MGrissom-Gelona List-00001-00002
 | How To Link Access Data Tables | MGrissom-How to Link-00001-00003
 | YI684 Data 3.30.10 (Labels.htm; monitor.dbf; YI602.dbf; yi684bl2.dbf; YI684CL.dbf; YI684DL.dbf; YI684EL.dbf; YI684L.dbf; yi684pl.dbf; YI701MBL.dbf; yi701mcl.dbf; yi701mdl.dbf; yi701mel.dbf) | n/a

Type of Document | Name of Document | Bates Range, if applicable
Query Lists | List of Non-Access Database Queries | Data C&R-4-00001-13
 | List of YI678 Queries | YI678-00001-4
 | List of YI684 Queries | YI684-00001-7
 | YI678 Queries with Descriptions | Data C&R-9-00001-15
 | YI684 Queries with Descriptions | Data C&R-11-00001-44
 | Current List of Non-Access Queries (incomplete) | KIDSRptList-11.1.10-00001-20
 | Current List of Non-Access Queries (complete) | KIDSRptList-11.1.10-00001-00044
 | List of YI701 Queries | Listof701Queries-00001-6
 | Third List of Queries – Data C&R-4-00001-13 Contained in Access Databases | MGrissom-List of Queries-00001-00003
Emails with attachments | Emails re: Access database use | WhiteA-004002, WhiteA-004003-4007, WhiteA-016446, WhiteA-016447
 | Emails re: issues with KIDS reports | Issuesw-AccessComm-00001-00096
 | Emails re: KIDS reports task force | Survey-Taskforce-00001-00005
Deposition Transcripts and Exhibits | Deposition transcript and exhibits of Mary Grissom, 10/1/2008 | n/a
 | Deposition transcript and exhibits of Mary Grissom, 8/5/2010 | n/a
 | Deposition transcript and exhibits of Mary Grissom, 9/7/2010 | n/a
 | Deposition transcript and exhibits of John Gelona, 9/23/2010 | n/a
 | Deposition transcript and exhibits of Jin Jew, 11/9/2010 | n/a
 | Deposition transcript and exhibits of Nancy Elizabeth Roberts, 11/9/2010 | n/a
 | Deposition transcript and exhibits of J.G. Nair, 12/1/2010 | n/a

Type of Document | Name of Document | Bates Range, if applicable
Standalone documents | Document entitled "Problems with YI 684 Queries in Litigation Discovery" | n/a
 | Children’s Safety Initiative: Oklahoma’s CW Practice Model Implementation and Training Plan | CWPMSC-4.2010-00001-13
 | Resume of Jayaprakash Nair | Resumes-00003-7
 | Instructions for seeing the SQL source code generated by the ACCESS GUI previously supplied | AccessDBQueries-SQLInst-00001
 | Meeting minutes from CFSD Administrative Staff Meeting, 8/9/2010 | LimitQueriesComm-00001-00002
 | KIDS Version Notes | Data C&R-3-00001-450
 | KIDS Screen Fields to Data Elements | Data C&R-1-00001-72
 | Foster Care AFCARS Elements | Data C&R-2-00001-29
 | KIDS Application Guide | n/a
 | KIDS Picklist Values | KIDS Picklist Values-00001-267
 | Current Version of the KIDS Reports Page | KIDSRptPage-11.1.10-00001
 | Web FOCUS Headers | WebFOCUSHeaders-00001-30
 | YI678 Data File Header Description | Data C&R-12-00001-10
 | YI684 Data File Header Description | Data C&R-10-00001-15
 | OKDHS Data Dictionary | OKDHS Data Dictionary KIDS-00001-02662
 | KIDS User Manual | KIDS Manual-00001-00391

Supplement 4

Errors found in the 2009 and 2004 KIDS Version Notes

The following are specific examples of errors that were found in the 2009 and 2004 KIDS version notes:

2009 KIDS Version Notes

• Less than optimal Child Death Screen functionality (Data C&R-3-00438-39).
• Error in Case Review functionality, error message when making multiple changes (Data C&R-3-00446-47).
• Disappearing Foster Care Claims, documentation issues/errors (Data C&R-3-00447).
• Error in Financial Management – February Claims, not handling dates appropriately (Data C&R-3-00447).
• Correction needed in Client Information functionality for Case Connect Rollbacks (Data C&R-3-00447).
• "Difficulty" with Investigation Close – Out of Home functionality (Data C&R-3-00447).
• Error in Private/Tribal functionality (Data C&R-3-00448).
• Private/Tribal Adoption Case error, date acceptance problem (Data C&R-3-00448).
• Error in the AWOL Warrant Information function, missing information after saving it (Data C&R-3-00448).
• Error in Reports-Resource Contacts function, time frame issues/errors (Data C&R-3-00449).

2004 KIDS Version Notes

• Errors are generated when documenting the Investigation Interview date, populating at the year 1900 if certain information is added (Data C&R-3-00017).
• Errors are generated when working with the Individual Service Plans, copying/populating over wrong dates (Data C&R-3-00017).
• Database errors in the Parental Rights Fast Add function (Data C&R-3-00018).
• Error in the Pre-Resource Report (Data C&R-3-00018).
• Error in CWS-KIDS-1 Referral Information Report; not printing out injury specifics properly (Data C&R-3-00018).
• Error in the CWS-KIDS-25 Progress Reports, not populating correct fields when printing Progress Reports (Data C&R-3-00019).
• Error in the Case Review Manual Assignment function (Data C&R-3-00020).
• Error in the On Call-Organization function, did not allow for times; only dates could be added (Data C&R-3-00034).
• Error because workers can select bad labels; Out of State or Tribal Jurisdiction not valid (Data C&R-3-00035).
• Error in the OCS Referral Screen; not populating names correctly (Data C&R-3-00036).
• Error in a critical Referral Information Report, CWS-KIDS-1; not numbering sections correctly when user would print (Data C&R-3-00036).
• Error in the Visitation Episode screen (Data C&R-3-00036).
• Error in the Investigation Assessment/Investigation Close Date function (Data C&R-3-00036-37).
• Error in the Adoption Zip Code field (Data C&R-3-00046).

• Error in the Court Hearing function; populating wrong information (Data C&R-3-00046).
• Bad data populating in Reports (Data C&R-3-00047).
• Error in the Mental Health Commitment functionality, unable to record information properly (Data C&R-3-00047).
• Errors in AFCARS goals functionality; data not pulling to reports (Data C&R-3-00048).
• Error in Child's Needs Assessment, past assessments not read-only, workers could manipulate past assessments (Data C&R-3-00048).
• Errors (“item does not pass validation test”) are generated when documenting important information about children's Treatment Plans (Data C&R-3-00050-51).