Report on the KIDS System: Review and Analysis
Zoran Obradovic
March 15, 2011
1. Executive Summary
I was asked by the Plaintiffs in the federal class action D.G. v. Henry to analyze the
KIDS System, the statewide automated child welfare information system of the Oklahoma
Department of Human Services (“DHS”), and the processes used by DHS to generate reports
from the KIDS System. I am currently the Director of the Center for Information Science
and Technology at Temple University and a Professor of Computer and Information
Sciences at Temple University. I also hold a Ph.D. in computer science, have expertise that
includes data mining and data management and have over 200 publications.
The KIDS System lacks structured tracking and testing processes. These processes
are crucial to any properly functioning data management system. DHS’s failure to
implement or effectively manage these processes has led to serious data quality problems
which have negatively – and severely – impacted the child welfare reports that are based
on data from the KIDS System. There is a significant likelihood that every child welfare
report contains inaccurate, unreliable and/or outdated information. Every child welfare
worker, supervisor and manager who utilizes these reports is potentially utilizing an
erroneous report, thus putting in harm’s way the very children they are responsible for
serving.
The primary problems with the child welfare data management system in use at
DHS are summarized in this section.
1. Management Problems and Organizational Issues. The personnel of the Technology and Governance Unit (“TGU”) have primary responsibility for the KIDS System and child welfare reporting. They are supported by a small group of reports programmers from the Data Services Division (“DSD”), who are “co-located” with TGU. The TGU personnel have little or no background in computer programming and are under-qualified for their positions. The TGU personnel write the queries for a key set of reports – Access reports – despite their lack of qualifications to do so correctly. They also have ultimate responsibility for a second set of reports – WebFOCUS reports – the queries for which are written by the reports programmers. TGU does not monitor who accesses the child welfare reports and allows inaccurate and unused reports to continue to be produced. The former manager of the reports programmers had a very hands-off management style and an inadequate background in the programming languages used by his direct reports. He allowed job partitioning and heavy specialization, both poor management practices. DHS also relies much too heavily on co-location to ensure adequate coordination between TGU and the DSD reports programmers. In practice, these two groups are not collaborating well due to a lack of formal regular communication. These poor management practices and organizational issues enable a number of serious data management problems.
2. Lack of Adequate Change Control. The KIDS System is constantly changing, which affects all of the programs and applications that interact with the KIDS System, including the computer programs that underlie the child welfare reports. Changes to database schemas1 like the KIDS System often require corresponding changes to the applications the database supports. Therefore, it is critical to track and manage database schemas properly. The KIDS System and the child welfare reports require (a) a source code version control system, which records all changes made to the source code and allows for a way to return to previous versions when an error is found; and (b) database change management software, which tracks changes to the database, applies those changes to all affected programs and applications that interact with the database and allows for a way to return to previous versions when an error is found. Instead, revisions to the programming code for the child welfare reports are tracked informally using “comments” written in the code. Further, DHS relies on co-location and informal communication between TGU and DSD to ensure that all relevant changes to the KIDS System are applied to the computer programs underlying the child welfare reports. The processes used by DHS are totally inadequate and lead to a high risk of unreliable and outdated child welfare reports.
3. Lack of Adequate Quality Control. The KIDS System does not have adequate quality control to ensure that the child welfare reports are accurate. DHS does not utilize a standard and formal protocol for evaluating whether software is built according to the specifications (called “verification”) and whether the software is what the end user needs (called “validation”). The protocol should include rigorous testing of the software by the programmers who develop the programs (called “white box testing”) as well as rigorous evaluation by people who are not involved in the software development (called “black box testing”). Instead, DHS relies on face validity testing that assesses, without reference to any defined standards, whether the reports “look like they would work.” These practices are insufficient and lead to a significant risk that the child welfare reports are inaccurate and unreliable.
4. The Child Welfare Reports Are Wrong. Serious errors have already been identified in some of the Access reports. The lack of adequate change control and quality control suggests that these problems are likely more widespread, and that similar errors exist in other Access and WebFOCUS reports. In my opinion, the erroneous reports are numerous and use of these reports by child welfare workers, supervisors and managers is harmful to the children who rely on those workers.
Specific documents and facts that formed the basis for these opinions are discussed in the
following sections.
1 In simple terms, a “database schema” is the layout of a database or the blueprint that outlines the way data is organized. The schema defines the tables, fields, relationships, views, functions, queries and other elements of a database.
2. Qualifications, Scope of Review and Analysis
Children’s Rights hired me to serve as a testifying expert in the D.G. v. Henry
litigation to opine on the KIDS System and the reports generated from the KIDS System. I
am well-qualified to provide an opinion on these subjects. I received a Ph.D. in computer
science from Pennsylvania State University in 1991 and I am currently the Director of the
Center for Information Science and Technology at Temple University in Philadelphia and a
Professor of Computer and Information Sciences at Temple University. My research and
teaching interests include data mining, data management, databases and algorithms. I have
received numerous grants, including grants from the National Institutes of Health and the
National Science Foundation. I have published over 200 articles, book chapters and
refereed conference articles on topics ranging from biomedical informatics to data mining
to knowledge systems. I have also been an editor for a number of journals in my field, have
served as a chair and committee member, and have given lectures at many conferences in my
field. A full description of my qualifications, including a list of my publications, is provided
in Supplement 1. David Schwartz, who has an M.S.W., provided research assistance for this
project. His resume is provided in Supplement 2.
My report is based upon an analysis of the documents, files and deposition
transcripts, listed in Supplement 3, provided to me by Children’s Rights. To prepare this
report, I analyzed the provided materials from December 2, 2010 to January 31, 2011.
During the past four years I have not testified as an expert witness at a trial or by
deposition. My total compensation for the preparation of this expert report, including the
amount paid to Mr. Schwartz, was $42,350. I will be paid $400/hour for preparing for and
attending deposition testimony and $450/hour for preparing for and attending trial
testimony.
3. The Architecture of the KIDS System and the Generation of Child Welfare Reports
The KIDS System is Oklahoma’s statewide automated child welfare information
system.2 It is used by DHS personnel statewide for managing cases and documenting
casework.3 Information about individual cases is input into the KIDS System by DHS
personnel. The KIDS System is updated in real time.4 So, for example, if a worker in
Oklahoma City enters information about a child into the KIDS System, a worker in Tulsa
will immediately be able to see that information by accessing the KIDS System.
2 Grissom (10/1/08), 20. 3 Grissom (10/1/08), 21, 101; Grissom (9/7/10), 54. 4 Grissom (9/7/10), 12.
The data in the KIDS System resides in an Oracle object-relational database.5 The
KIDS System is maintained by two groups of programmers. First, there is a group of
PowerBuilder programmers who are responsible for maintaining and updating the “front
end” user interface of the KIDS System, i.e., the screens that child welfare personnel see
when they utilize the KIDS System. These programmers use the PowerBuilder
programming language.6 Second, there is a group of database administrators who are
responsible for maintaining and updating the “back end” of the KIDS System, i.e., the
underlying database structure. The database administrators use the SQL query language.7
In order to create reports from the data maintained in the KIDS System, the data
must be “extracted.”8 Data extracts are constructed from the Oracle database using
programs developed in the WebFOCUS environment.9 Most of these WebFOCUS extracts
are produced weekly, but some are also produced daily and some on demand.10 Unlike the
KIDS System itself, these data extracts are not updated in real time. Instead, they are
frozen in time at the moment they are created and are only updated when a new data
extract is produced.11
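The difference between the live database and a frozen extract can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely for illustration; the table and column names (placement, placement_extract, child_id, county) are hypothetical and are not taken from the KIDS System or from any WebFOCUS extract.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical "live" table standing in for the real-time database.
conn.execute("CREATE TABLE placement (child_id INTEGER PRIMARY KEY, county TEXT)")
conn.execute("INSERT INTO placement VALUES (1, 'Oklahoma')")

# Producing an extract copies the data as it exists at that moment.
conn.execute("CREATE TABLE placement_extract AS SELECT * FROM placement")

# A later update is visible immediately in the live table...
conn.execute("UPDATE placement SET county = 'Tulsa' WHERE child_id = 1")

live = conn.execute(
    "SELECT county FROM placement WHERE child_id = 1").fetchone()[0]
frozen = conn.execute(
    "SELECT county FROM placement_extract WHERE child_id = 1").fetchone()[0]

# ...but the extract keeps the old value until a new extract is produced.
print("live:", live, "| extract:", frozen)
```

Any report built on the extract table in this sketch would continue to show the earlier county until the extract is regenerated.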
All child welfare reports are generated from these WebFOCUS extracts, either
directly or indirectly. Two different platforms are used for generating child welfare
reports.12 First, some reports are constructed in the WebFOCUS environment by a group of
three DSD WebFOCUS programmers.13 Throughout this report these programmers will be
referred to as the “DSD reports programmers.” The WebFOCUS environment allows the
DSD reports programmers to use a single template for rapid construction of many custom-
built reports by selecting various columns, criteria and output formats. These reports are
called the “WebFOCUS reports.” In addition to using WebFOCUS reports for internal state
reporting,14 DHS also uses WebFOCUS reports to report data to the federal government –
both AFCARS (adoption and foster care data) and NCANDS (child abuse and neglect data).15
The use of the WebFOCUS environment requires knowledge of WebFOCUS
programming, and since the remaining personnel supporting the KIDS System have no
5 Grissom (9/7/10), 9. 6 Nair, 39, 48-51; Jew, 34. 7 Nair, 48, 67-69. 8 Gelona, 27-29. 9 Gelona, 16, 100, 104-106, 172. 10 Gelona, 20-21; Grissom (9/7/10), 30. 11 Grissom (9/7/10), 56, 77-78. 12 Gelona, 20, 24-26. 13 Gelona, 217. 14 Grissom (9/7/10), 36. 15 Gelona, 181.
training in this programming language, a different method is used for the remaining
reports. The DSD reports programmers create three large WebFOCUS extracts – one with
data on permanency planning (YI684), one with data on resources (YI678) and one with
data on staffing (YI701) – and these extracts are converted into tables and loaded into an
Access database.16 Access is a more familiar environment17 and does not require specific
training on WebFOCUS. Members of the reporting group within TGU are responsible for
writing queries in Access.18 Throughout this report the members of the TGU reports group
will be referred to as the “TGU reports group.” Once written, the Access queries are
executed regularly and reports are generated on a weekly basis.19 These reports are called
the “Access reports.”
Every person who has access to the KIDS System is also given access to all of the
Access reports and the large majority of the WebFOCUS reports.20 Access to these reports
is provided in a variety of ways, including through DHS’s intranet.21
4. Management Problems and Organizational Issues
As discussed above, there are two groups within DHS who are responsible for
generating child welfare reports: the DSD reports programmers and the TGU reports
group. There are significant management and organizational issues with the way these two
groups are operated and the way in which they interact with each other.
The responsibilities of TGU include the help desk, testing, functional design and
requirements gathering for the KIDS System.22 The Programs Administrator of TGU is
Mary Grissom and the Assistant Administrator is Carol Clabo. Five program managers
report to Ms. Grissom: Elizabeth Roberts, who is responsible for federal reporting,
Patricia Frye, who is responsible for state reporting, Vickie Streber, who is the functional
advisor, Stacey Bates, who is responsible for KIDS testing and Kellie Mullen, who is
responsible for KIDS training.23 Ms. Roberts and Ms. Frye are the leaders of the TGU
reports group. No one in TGU has any background in computer science or computer
programming.24
16 Gelona, 94-95, 101-111. 17 Gelona, 158-159. 18 Gelona, 20, 148; Grissom (9/7/10), 44. 19 Grissom (9/7/10), 44; Gelona, 110-111, 150. 20 Gelona, 117; Grissom (10/1/08), 27-28. 21 Gelona, 20, 66-67, 115-116. 22 Grissom (9/7/10), 14-15. 23 Roberts, 34-35; Grissom (8/5/10), 6. 24 Grissom (9/7/10), 17-18, 44-45; Roberts, 15-18, 46-47; Jew, 40-42.
The DSD reports programmers are co-located with TGU (i.e., physically located in the
same space).25 They consider themselves to be service providers to the TGU reports
group;26 ultimate responsibility for the child welfare reports lies with TGU.27
Currently there are three DSD reports programmers, John Gelona (the team
leader), Jin Jew and John Vernon.28 These three reports programmers are responsible for
writing, maintaining and updating the computer programming underlying hundreds of
child welfare reports accessible to approximately 2,000 users of the KIDS System.29 Until
November 2010, the DSD reports programmers were supervised by J.G. Nair.30
Ms. Grissom’s management of TGU is deficient for a number of reasons:
Ms. Grissom completely lacks a background in computer science and computer programming,31 which makes it impossible for her to personally evaluate the Access queries produced by her group or the WebFOCUS reports produced by the DSD reports programmers. For example, in her testimony Ms. Grissom discussed an error discovered by Mr. Gelona that apparently resulted from the use of lower case text rather than upper case text by a member of TGU who wrote a query in Access. Because of her lack of experience with Access, she did not know whether Access required upper case, let alone have the ability to check this query herself.32
Ms. Grissom’s lack of experience is compounded by the fact that no one within TGU has any experience in computer science or computer programming. For example, Ms. Roberts, who is responsible for all federal reporting and adoption and post-adoption reporting,33 has had no training in computer programming,34 which is needed to develop and test the queries used to create child welfare reports. Simply put, the “Technology and Governance Unit” has no one with a background in technology.
Ms. Grissom also does not pay sufficient attention to important details related to her job responsibilities. One example is the fact that Ms. Grissom incorrectly believed that the Access databases “only have available the information that is current for that week,”35 only to learn through the litigation process that in fact DSD
25 Roberts, 43. 26 Nair, 18. 27 Roberts, 61; Jew, 80; Gelona, 62. 28 Gelona, 61; Jew, 16-17. 29 Gelona, 61. 30 Nair, 20. 31 Grissom (9/7/10), 17-18, 108. 32 Grissom (9/7/10), 70. 33 Roberts, 20. 34 Roberts, 60, 63. 35 Grissom (8/5/10), 41.
can restore data for any period.36 It is difficult to believe that DHS would put significant effort into long-term data backup procedures while the head of TGU, a group that benefits from such backup data, would remain unaware of this important service. A more likely explanation is a lack of attention to detail by Ms. Grissom. Another example is that Ms. Grissom is aware that changes to the KIDS System require changes to the Access queries, but she did not implement any policies or procedures for checking if changes to the KIDS System were incorporated into the queries. Instead, she expected her group members to manage that process without monitoring the results.37
TGU does not track or monitor whether DHS employees access the child welfare reports, let alone who accesses those reports or how frequently they do so.38 Furthermore, certain reports continue to be produced that have not been used for years and contain significant errors.39 As a specific example, the YI624 report has not been updated in years, so the computer code ignores many tables that are currently available and that were not there when the code for this report was developed.40 This report – which is still available to every child welfare worker, supervisor and manager in the state – incorrectly shows about 7,000 children with no placement. Providing access to such inaccurate reports without tracking if anybody is using the data is a dangerous practice that should be stopped.
Mr. Nair’s management of the DSD reports programmers was deficient for a number of reasons:
Mr. Nair did not have adequate experience with Access and WebFOCUS to manage programmers who utilized WebFOCUS exclusively and who created extracts whose sole purpose was the creation of Access queries.41 Furthermore, he did not conduct meetings with the DSD reports programmers to review the programs they wrote to ensure that they met any standards and to allow for easier software integration.42
Mr. Nair focused on managing the DSD reports programmers’ time, i.e., ensuring that they had sufficient time to fulfill their tasks.43 He did not focus enough on task-specific management, i.e., ensuring that those tasks met the necessary standards.
Mr. Nair had poor oversight of his team and the software development and data management process. Job partitioning among the DSD reports programmers in his
36 Mr. Gelona thought that backups are available only up to five weeks due to the backup tape rotation process; this is also incorrect (Gelona, 112). Like Ms. Grissom, Mr. Gelona displays a disturbing lack of attention to important details. 37 Grissom (9/7/10), 116-117. 38 Grissom (9/7/10), 65, 85, 90-91; Gelona, 71, 144-145, 152. 39 Gelona, 31. 40 Gelona, 31-32, 135. 41 Nair, 102, 121. 42 Nair, 116-118; Jew, 34-35, 92. 43 Nair, 13-14.
group was such that the programmers’ tasks were too specialized and they were not required to rotate positions in order to become familiar with the functions carried out by other team members. It was evident from Mr. Jew’s deposition that he is not aware of many tasks that Mr. Gelona performs.44 This is probably due to a lack of rotations in this group and suggests a poor management practice. A rotation practice in the software engineering community is aimed at ensuring that system development and maintenance is not overly dependent on a single programmer. Here, it appears that if Mr. Gelona leaves, Mr. Jew would not be prepared to replace him because his current tasks are overly specialized. As a result, there could be significant problems in the future if Mr. Gelona ever leaves DHS. When Ms. Grissom was asked, “What would happen if Mr. Gelona left tomorrow?” her answer was, “We’d all have to learn a lot.”45 In my opinion, that is a serious underestimation. In practice, it would take a long time to learn such a complex set of tasks and this would be even more challenging given that no one at TGU has any training in the required computer programming skills.
Furthermore, Ms. Grissom, Mr. Nair and the personnel who report to them rely
completely on co-location to ensure adequate communication between the TGU reports
group and the DSD reports programmers.46 While co-location is potentially useful, it is not
in any way an adequate substitute for formal processes to monitor data quality and to
manage changes in related applications and outputs, including the child welfare reports,
when the underlying KIDS System is changed. There is a total lack of routine and formal
daily communication between the TGU reports group and the DSD reports programmers.47
Finally, Ms. Grissom, Mr. Nair and the personnel who report to them have allowed
ineffective change control and quality control practices to be implemented. These concerns
are discussed in detail below and are typically addressed by insisting on appropriate
software engineering practice and close formal interactions with members of applications
groups that depend on the programming team’s output.
Overall, neither Ms. Grissom nor Mr. Nair provided effective oversight of their
respective groups. Lax management practices and poor organization have enabled the
inadequate data management practices discussed below.
5. Lack of Adequate Change Control
Software code, including the software code that underlies the KIDS System, is
constantly changing as the computer program is modified and updated. It is necessary to
have the proper process in place to manage these changes in order to ensure that those
44 Jew, 28-29, 36-38, 57-58, 92, 95. 45 Grissom (9/7/10), 74. 46 Nair, 86, 116-118; Grissom (9/7/10), 14, 34-35. 47 Grissom (9/7/10), 34-35, 46-47, 109-110, 118-119; Nair, 91-92, 116.
changes trickle down to, and are properly implemented in, all programs and applications
that interact with the software. The KIDS System is no exception because it interacts with
other programs and applications, including those that are used to create the child welfare
reports. DHS has failed to implement adequate change control systems. As a result, there
is a high risk that the data maintained in the KIDS System, and in the applications that
interact with the KIDS System, is unreliable.
a. Failure to Utilize a Source Code Version Control System
The KIDS WebFOCUS extracts and the Access queries appear to lack a “source code
version control” system. Source code version control systems have a number of functions,
including: (1) recording the changes made to software source code and storing every
version of the source code; (2) allowing a line-by-line comparison of any two versions of
the software source code; and (3) providing rollback support, which is an effective way of
returning to any previously-tested and committed version of the software48 when errors
are found in the current operational version. Essentially, source code version control
systems function for software much as version control systems function for word-
processed documents: users can save multiple versions, compare them to each other and
revert to old versions if necessary.
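The three functions listed above (recording every version, line-by-line comparison, and rollback) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a description of how CVS, Subversion or Git are implemented: the VersionStore class, its method names and the sample query text are all hypothetical, and only the standard-library difflib module is used.

```python
import difflib


class VersionStore:
    """Toy in-memory version history for a single source file (hypothetical)."""

    def __init__(self):
        self._versions = []  # list of (message, text), oldest first

    def commit(self, text, message):
        """Record a new version and return its 1-based version number."""
        self._versions.append((message, text))
        return len(self._versions)

    def get(self, number):
        """Return the stored text of the given version."""
        return self._versions[number - 1][1]

    def diff(self, old, new):
        """Line-by-line comparison of two stored versions."""
        return list(difflib.unified_diff(
            self.get(old).splitlines(), self.get(new).splitlines(),
            fromfile=f"v{old}", tofile=f"v{new}", lineterm=""))

    def rollback(self, number, message):
        """Restore an earlier version by committing it again as the newest one."""
        return self.commit(self.get(number), message)


store = VersionStore()
# Hypothetical query text; not taken from any DHS report.
v1 = store.commit(
    "SELECT age_group, COUNT(*)\nFROM children\nGROUP BY age_group",
    "initial query")
v2 = store.commit(
    "SELECT age_group, COUNT(*)\nFROM child\nGROUP BY age_group",
    "point query at renamed table")

for line in store.diff(v1, v2):
    print(line)

# If version 2 turns out to be wrong, version 1 can be restored without
# losing the record that version 2 ever existed.
v3 = store.rollback(v1, "revert: version 2 produced wrong counts")
```

The history in this sketch is append-only: a rollback adds a new version rather than erasing the faulty one, so the record of what changed and when is preserved.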
The WebFOCUS extracts and the Access queries need a tool that records the history
of changes made to their software source code and rollback support that allows for
reversion to previous versions of the source code when errors emerge. Errors are
unavoidable; however, it is necessary to have a system in place for addressing them by
returning to a version of the program that predates the error. Source code version control
systems have long been readily available and are standard software engineering practice.
For example, CA Software Change Manager (previously called Harvest) was developed in
the early 1970s and more recent products include Concurrent Versions System (called
CVS), Subversion (called SVN), and Global Information Tracker (called Git).
Instead of using a source code version control system, software revisions in the
WebFOCUS extracts are tracked informally based on comments written in the source code
by the computer programmers, accompanied by their initials to identify who made the
change.49
48 Committing is a “check-in” process in which a set of changed files is submitted to the source code version control system as a bundle, along with a description of the specific changes and the evaluation process. This is aimed at ensuring software quality and traceability of changes. 49 Gelona, 37-38; Jew, 49, 88-89.
A similarly informal system was used for the Access reports. I examined this more
closely by analyzing the system used to create the Access reports. I began by examining the
names and descriptions of the queries used to create these reports.50 These documents
only provide brief descriptions of the queries rather than their detailed structure and a
history of all changes. I was able to obtain a more detailed view of some queries by running
Microsoft Access and reviewing the SQL code underlying these queries. The notes
associated with these queries do not provide much additional information other than the
date of creation and the date of last modification of a query. For example, according to the
time stamps in the database, a query for “Count of Children by Age Group” was created on
July 26, 2002 and was last modified on August 28, 2009. No information was provided
about what modification was made on August 28, 2009 or whether there were additional
modifications in the seven preceding years. Similarly, the creation time stamp for the
query “Count of Children by Age Graphed” is November 7, 2007 and the last modification
was on February 11, 2009. If DHS utilized a source code version control system, it would
be easy to determine the differences between these two versions of the query as well as to
find out how often and to what extent the query was modified.
The KIDS System evolves over time. When an error is found in the KIDS System it is
risky and time-consuming to make immediate additional changes to the computer program
as any modification could result in further errors unless it is rigorously tested before
deployment. In the meantime, a source code version control system allows a quick reversion
to a known reliable version. It also allows the administrators to track the changes and
determine what caused the error.
b. Failure to Utilize Database Change Management Software
The KIDS System also appears to lack “database change management software.”
This software is necessary when a database interacts with other computer programs and
applications because it allows you to (1) track any changes made to the database; (2)
ensure that those changes are properly applied to all programs that interact with the
database; and (3) return to any previous state of the database by restoring all tables, fields,
relationships, views, functions, queries and other elements by using a rollback function.
One example of a database change management tool which allows for rollback is Liquibase.
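Functions (1) and (3) above, tracking changes and rolling them back, can be sketched as follows. This is an illustration only and does not reproduce the interface or file format of Liquibase or any other product. It uses Python's built-in sqlite3 module, and the changeset identifiers, table names and helper functions (update, rollback, applied, tables) are hypothetical. Function (2), propagating changes to dependent programs, is not covered by this sketch.

```python
import sqlite3

# Each hypothetical changeset pairs a forward statement with the statement
# that undoes it.
CHANGESETS = [
    ("001-create-child",
     "CREATE TABLE child (id INTEGER PRIMARY KEY, name TEXT)",
     "DROP TABLE child"),
    ("002-create-placement",
     "CREATE TABLE placement (child_id INTEGER, county TEXT)",
     "DROP TABLE placement"),
]


def applied(conn):
    """Return the ids of applied changesets, oldest first."""
    return [row[0] for row in
            conn.execute("SELECT id FROM changelog ORDER BY seq")]


def update(conn):
    """Apply every changeset not yet recorded in the changelog table."""
    conn.execute("CREATE TABLE IF NOT EXISTS changelog "
                 "(seq INTEGER PRIMARY KEY, id TEXT UNIQUE)")
    done = set(applied(conn))
    for cs_id, forward, _ in CHANGESETS:
        if cs_id not in done:
            conn.execute(forward)
            conn.execute("INSERT INTO changelog (id) VALUES (?)", (cs_id,))
    conn.commit()


def rollback(conn, count):
    """Undo the most recent `count` changesets, newest first."""
    undo = {cs_id: backward for cs_id, _, backward in CHANGESETS}
    ids = applied(conn)
    for cs_id in reversed(ids[max(0, len(ids) - count):]):
        conn.execute(undo[cs_id])
        conn.execute("DELETE FROM changelog WHERE id = ?", (cs_id,))
    conn.commit()


def tables(conn):
    """List user tables, excluding the bookkeeping table."""
    rows = conn.execute("SELECT name FROM sqlite_master "
                        "WHERE type = 'table' AND name != 'changelog'")
    return sorted(row[0] for row in rows)


conn = sqlite3.connect(":memory:")
update(conn)
state_after_update = (applied(conn), tables(conn))
rollback(conn, 1)
state_after_rollback = (applied(conn), tables(conn))
print(state_after_update)
print(state_after_rollback)
```

Because every structural change is recorded in the changelog table, the database itself can answer the question of which changes have been applied and in what order.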
At his deposition, Mr. Nair described the only change management technology in use
at DHS, a program called Remedy. The Remedy program tracks changes at the level of a
KIDS System release, not at the level of specific enhancements to the KIDS System. So, the
Remedy program cannot be used for testing, data quality control or auditing of changes to
50 E.g., YI684 Queries with Descriptions (Data C&R-11-00001-00044); YI678 Queries with Descriptions (Data C&R-9-00001-00015).
the KIDS System. Mr. Nair testified further that Remedy only works at the level of “system
changes” and does not “track to [the] level” of specific changes to a WebFOCUS or
PowerBuilder program.51 Database change management software should track all changes
that are made to the database that will affect any software that depends on the database,
including all software used to generate reports from the database.
My analysis of the “version notes” that accompany the periodic (usually monthly or
bimonthly) releases of new versions of the KIDS System52 shows that the KIDS System
regularly undergoes many structural changes.53 Database change management software
should have been utilized to ensure that these changes were properly applied to all
programs and applications that interact with the KIDS System, including those used to
create the data extracts that underlie the WebFOCUS and Access reports. Database change
management software would also have allowed DHS to return to any previous state of the
database if an error was identified. DHS’s failure to utilize appropriate change
management techniques almost certainly results in errors in programs and applications
that rely on the KIDS System for input; without a way of returning to a functional version of
the database, these errors will be difficult to correct.
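How a structural change can silently invalidate a dependent report, and how an automated check can catch it, is sketched below. This is an illustration under stated assumptions: the report names, table names and column names are hypothetical, the dependency list is maintained by hand for the example, and Python's built-in sqlite3 module stands in for the production database.

```python
import sqlite3

# Hypothetical registry: the tables and columns each report's query relies on.
REPORT_DEPENDENCIES = {
    "children_by_county": {"placement": {"child_id", "county"}},
    "children_by_age": {"child": {"id", "birth_date"}},
}


def current_schema(conn):
    """Map each table name to the set of its column names."""
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    schema = {}
    for table in names:
        # PRAGMA table_info returns one row per column; index 1 is the name.
        schema[table] = {row[1] for row in
                         conn.execute(f"PRAGMA table_info('{table}')")}
    return schema


def stale_reports(conn):
    """Return {report: [missing 'table.column', ...]} for affected reports."""
    schema = current_schema(conn)
    problems = {}
    for report, needs in REPORT_DEPENDENCIES.items():
        missing = sorted(f"{table}.{column}"
                         for table, columns in needs.items()
                         for column in columns - schema.get(table, set()))
        if missing:
            problems[report] = missing
    return problems


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE child (id INTEGER PRIMARY KEY, birth_date TEXT)")
conn.execute("CREATE TABLE placement (child_id INTEGER, county TEXT)")
before = stale_reports(conn)

# A new release restructures a table; the report definitions are not updated.
conn.execute("DROP TABLE placement")
conn.execute("CREATE TABLE placement (child_id INTEGER, county_code TEXT)")
after = stale_reports(conn)

print(before)
print(after)
```

A check of this kind, run after each release, would flag the affected report automatically rather than depending on someone remembering to mention the change.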
Instead of utilizing database change management software, DHS relies entirely on
co-location, and the resulting informal communication between members of TGU and the
reports programmers, to ensure that every relevant change made to the KIDS System is
communicated to the DSD reports programmers. The testimony amply demonstrates
DHS’s complete reliance on co-location and informal communication to ensure that
changes to the KIDS System are properly implemented in the programming code for the
Access and WebFOCUS reports. For example:
From the deposition of Mr. Nair, it is clear that he relies very heavily on the concept of co-location to ensure that the Access and WebFOCUS reports are accurate.54 Mr. Nair’s reliance on co-location is alarming given his computer science background because co-location is not a sound change management practice. Furthermore, Mr. Nair did not “think there’s one single person that has the responsibility of making sure that any change we make on KIDS . . . what change it will have on the reports. We don’t have one person responsible for that.”55 He was not aware of any systematic way in which change control is effected or formal meetings where change control is discussed.56 This testimony was very surprising because the
51 Nair, 37-40, 72-75, 92. 52 Nair, 38, 90. 53 2004 Version Notes; 2005 Version Notes; 2006 Version Notes; 2007 Version Notes; 2008 Version Notes; January 2008 Version Notes; 2009 Version Notes. 54 Nair, 81, 86, 89-90, 92-93. 55 Nair, 87. 56 Nair, 87-89, 117-118.
version notes provide strong evidence that changes in the KIDS System included many data schema modifications that could easily impact the reports.
Mr. Gelona and his team of DSD reports programmers are not keeping track of changes made to the KIDS System and he does not regularly review the KIDS System version notes, or receive a summary of those version notes from anyone, in order to determine whether there have been any important changes to the KIDS System.57 Instead, he relies completely on TGU to tell him when a change occurs in the KIDS System that would require him to modify his programming code.58 Mr. Gelona insisted that his YI684 extract file, which is used to create the YI684 database and consists of 8,000 lines of computer programming code, does not require any regular updating or maintenance, no matter what changes are made to the KIDS System.59 Further, Mr. Gelona operates without source code version control management software or database change management software that would help to ensure that his extracts and his reports are accurate and to allow easy rollback to any previous version if needed.
Mr. Jew also fails to utilize source code version control and database change management tools. He does not even keep a record of which WebFOCUS reports he writes the source code for.60 Like Mr. Gelona, he relies entirely on comments he writes in the source code to record the changes he makes,61 but this approach does not provide a sufficient record of changes or rollback options to previous configurations. Furthermore, the only way Mr. Jew learns about changes to the KIDS System is from the TGU reports group, either directly or indirectly. He does not regularly receive or review the version notes to the KIDS application.62
Ms. Grissom relies on the concept of co-location, and places a great deal of trust in the DSD reports programmers,63 but does not have the computer programming background to evaluate whether they are utilizing best practices.64 Ms. Grissom did acknowledge that whenever a change is made to the KIDS System, it is necessary to make changes to affected Access and WebFOCUS queries but testified that TGU has no formal policy in place to ensure that this is done.65
Ms. Roberts is not familiar with sound software engineering methods for source code version control or database change management. In fact, Ms. Roberts does not regularly review the version notes to identify changes, errors and/or bugs that
57 Gelona, 128-130, 178-180, 233. 58 Gelona, 123-127, 130-132. 59 Gelona, 143. 60 Jew, 59. 61 Jew, 49, 88-89. 62 Jew, 89-95. 63 Grissom (9/7/10), 52, 117-119. 64 Grissom (9/7/10), 17. 65 Grissom (9/7/10), 110-111, 115-119.
might impact the reports for which she is responsible.66 She only becomes aware of changes that might affect reports through verbal discussions with Ms. Clabo or Ms. Streber, two other members of TGU. Ms. Roberts is responsible for discussing changes to the KIDS System with the DSD reports programmers if a change affects a report.67
The practices outlined above are totally insufficient, as evidenced by the recently-
discovered problems with the Access reports uncovered by Mr. Gelona (discussed in detail
below). It is unlikely, however, that these problems are limited to the Access reports. The
reliance on co-location and informal communication from non-programmers to ensure
change control has the potential to adversely affect every child welfare report generated by
DSD and TGU.
To summarize, if a change is made to the KIDS System, and that change affects the
programming code underlying a child welfare report, the process currently in place does
not ensure that the programming code will be updated. Thus, the child welfare report will
no longer be reliable or up-to-date. Because of the frequency with which changes are made
to the KIDS System, it is likely that this problem is widespread and that many child welfare
reports are adversely affected. No user of a child welfare report can rely on that report
being accurate or up-to-date.
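The kind of check that a formal change control process would perform can be illustrated with a minimal sketch. It assumes that someone maintains a map from each report to the database columns its query reads, and that each release of the system comes with a list of changed columns. The report names, column names and the function `affected_reports` are hypothetical and are not taken from the KIDS System or its version notes.

```python
from typing import Dict, Set


def affected_reports(
    report_columns: Dict[str, Set[str]], changed_columns: Set[str]
) -> Dict[str, Set[str]]:
    """Return, for each report, the changed columns that its query depends on."""
    hits: Dict[str, Set[str]] = {}
    for report, columns in report_columns.items():
        overlap = columns & changed_columns
        if overlap:
            hits[report] = overlap
    return hits


# Hypothetical dependency map: report name -> columns read by its query.
REPORT_COLUMNS = {
    "placement_count_by_area": {"placement.area_code", "placement.child_id"},
    "worker_visits": {"contact.worker_id", "contact.contact_date"},
}

# Hypothetical list of columns altered in one release of the system.
CHANGED = {"placement.area_code"}

flagged = affected_reports(REPORT_COLUMNS, CHANGED)
print(flagged)
```

With such a map in place, every report flagged by a release would be reviewed and retested before it is produced again, rather than relying on someone remembering to mention the change.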
6. Lack of Adequate Quality Control
In addition to a lack of adequate change control, the KIDS System suffers from a lack
of quality control. This lack of quality control affects not only the KIDS System itself, but
also the child welfare reports that are based on information stored in the KIDS System.
Neither the DSD reports programmers nor the TGU reports group sufficiently tests the
child welfare reports to ensure that they are accurate before they are made available to
child welfare workers, supervisors and managers.
Turning first to the KIDS System, the version notes that I reviewed list numerous
functionality errors that went undetected during the KIDS System software development
and testing process. A list of some specific examples is provided in Supplement 4. The
presence of so many errors suggests insufficiently rigorous software testing protocols. In
addition, I was surprised that Mr. Nair, given his position, was not sure whether the testing
done by contractors from Oklahoma University is only done for major enhancements or for
every modification to the KIDS System.68 This suggests that feedback to his group at DSD
from these testers was missing or that management of the programmers and/or testers
was poor.
66 Roberts, 56, 83. 67 Roberts, 54-60, 83-87, 102-103, 112-113. 68 Nair, 79.
Furthermore, Ms. Grissom is responsible for initiating changes to the KIDS System,69
yet she has no formal or informal programming background. She claims that any errors in
the child welfare reports are limited to the Access queries, while the KIDS System itself is
fine.70 Her confidence in the KIDS System is misplaced. For example, she suspects that a
“documentation error” is the reason why only one child is listed in a report as being in a
DHS-operated facility.71 Assuming Ms. Grissom is correct, this suggests insufficient
quality control at the data entry stage, such as poor training of workers and/or
inadequate data entry testing procedures. Furthermore, the fact that Ms. Grissom’s team
did not find such an obvious error is another serious concern, as this kind of outlier in
a dataset is very easy to spot with simple statistical techniques.
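As an illustration of one such simple technique, the sketch below flags counts that lie far from the median of their group, using the median absolute deviation (a standard robust measure of spread). The facility names, the counts and the threshold of 3.5 are assumptions chosen for the example; they are not data from any DHS report.

```python
import statistics


def flag_outliers(counts, threshold=3.5):
    """Return the entries whose modified z-score exceeds the threshold."""
    values = list(counts.values())
    med = statistics.median(values)
    # Median absolute deviation: the median distance of a value from the median.
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # At least half of the values equal the median; flag anything that differs.
        return {k: v for k, v in counts.items() if v != med}
    return {
        k: v
        for k, v in counts.items()
        if 0.6745 * abs(v - med) / mad > threshold
    }


# Hypothetical counts of children per facility in one report.
COUNTS = {
    "facility_a": 42,
    "facility_b": 38,
    "facility_c": 45,
    "facility_d": 1,
    "facility_e": 40,
}

print(flag_outliers(COUNTS))
```

A flagged value is not necessarily wrong, but it identifies the figure that a reviewer should trace back to the underlying records.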
This same lack of quality control permeates the process for evaluating the software
used to create the child welfare reports. Whenever programmers develop a piece of
programming code, it is standard practice to perform both “verification” and “validation.”
Verification is the process of evaluating software to determine whether it has been
designed in accordance with the specifications. Validation is the process of evaluating
software to determine if those specifications were correct in the first place, i.e., whether the
software actually meets the users’ needs. In other words, verification ensures that you
built it right, while validation ensures that you built the right thing. The software testing
protocol should include “white box testing,” or rigorous evaluation of the internal
structures of the software by the programmers who developed it, which can uncover many
errors in individual units of source code (e.g., control and data flow errors and branching
errors in the implementation of an algorithm). The software testing protocol should also
include “black box testing,” or rigorous evaluation of the software by people who were not
involved in its development by selecting valid and invalid inputs to determine if the
functional requirements are satisfied. At least for the KIDS System, DHS appears to utilize
white box testing by technical staff at DSD.72
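A black box test of a report query can be sketched as follows. The test loads a small set of records for which the correct answer is known in advance, including a duplicate row and a closed placement, and then checks the query's output against that answer. The table `placement`, its columns and the query text are hypothetical; they do not reproduce the KIDS System schema or any actual Access or WebFOCUS query.

```python
import sqlite3

# Hypothetical report query: children in open placements, counted by area.
REPORT_SQL = """
SELECT area, COUNT(DISTINCT child_id) AS children
FROM placement
WHERE end_date IS NULL
GROUP BY area
ORDER BY area
"""


def run_report(conn):
    """Run the report query and return its rows as a list of tuples."""
    return conn.execute(REPORT_SQL).fetchall()


def make_fixture():
    """Build an in-memory database with inputs whose correct output is known."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE placement (child_id INTEGER, area TEXT, end_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO placement VALUES (?, ?, ?)",
        [
            (1, "Area 1", None),
            (1, "Area 1", None),          # duplicate row: child must be counted once
            (2, "Area 1", "2010-01-15"),  # closed placement: must be excluded
            (3, "Area 2", None),
        ],
    )
    return conn


result = run_report(make_fixture())
assert result == [("Area 1", 1), ("Area 2", 1)], result
print("report query passed the fixture test")
```

The tester needs no knowledge of how the query is written, only of what the report is supposed to show, which is why such tests can be rerun unchanged each time the schema or the query is modified.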
However, DHS does not appear to follow any of these standard practices for the
child welfare reports. Instead, the DSD reports programmers and the TGU reports group
utilize “face validity” testing only, which assesses, without reference to any defined
standards, whether the reports “look like they work” without rigorous evaluation of
whether that is the case. The child welfare reports do not undergo any standardized
testing protocol before being released to child welfare workers, supervisors and
managers.73 This is completely inadequate and leads to a high likelihood that the computer
69 Grissom (10/1/08), at 155; Gelona at 62-63. 70 Grissom (9/7/10), 59. 71 Grissom (8/5/10), 23. 72 Nair, 76. 73 Grissom (9/7/10), 27-28.
code used to create the child welfare reports – and the reports themselves – contains
errors. Specifically:
According to Ms. Grissom, she and her team look at the outcome on a report, compare it to other similar reports and do a “number of different kinds of testing.”74 This is problematic because if they overlook an error by an informal comparison to some of the prior reports, then that error will become more difficult to identify later. A sound testing procedure is needed to minimize the possibility that errors stay undetected for long periods of time.
Ms. Grissom believes that the known problems with the Access reports (discussed below) would not have happened if the Access queries were “reviewed by the person who created [them] to make sure that the logic was good.”75 This is a naïve view of the software testing process. Instead, Ms. Grissom, as the Programs Administrator of TGU, should have enforced rigorous testing procedures, including internal and external testing and regular updates of queries by using the test protocols any time that a data schema affecting the query was modified. All versions of modified queries should have been committed and queries should have been evaluated over time. Ms. Grissom has not put into place any rigorous testing or change management protocols to ensure data quality, leading to a high risk that every child welfare report contains errors.
Ms. Roberts utilizes face validity testing to ensure that the reports she is responsible for are accurate. Basically, her quality control measures consist of nothing more than relying on the fact that she, a member of the TGU reports group or a member of the child welfare field staff will notice an error in the report itself.76 In an attempt to identify errors, she looks at the data in the reports and compares it to the information in the KIDS System. However, when a report contains aggregate data, “I don’t know that I have anything specifically to check [the report] against;” instead, all she is able to look at is “what has the data been telling you over time and . . . is this a reasonable fluctuation or steady line and is anybody questioning you about it.”77 Ms. Roberts’ objective seems to be to identify only those problems that “stick[] out like a sore thumb.”78 For the federal reports, Ms. Roberts also uses a compliance utility, one of the two utilities that were given to her by the Children’s Bureau, to help ensure data consistency and quality, but this utility is only used to identify obvious problems.79 Ms. Roberts stated that the other utility provided by the Children’s Bureau, the data quality utility, is not used very often, and that she is not certain how the utility functions.80
Mr. Nair testified that there was no systematic testing of any child welfare reports,
74 Grissom (9/7/10), 27-28, 38-39. 75 Grissom (9/7/10), 49. 76 Roberts, 50-51. 77 Roberts, 100-102. 78 Roberts, 51. 79 Roberts, 67-70. 80 Roberts, 70-71.
aside from the use of utilities provided by the federal government to test the AFCARS and NCANDS data. He was not aware of any systematic way in which WebFOCUS reports are tested after they have been put into production to ensure that they are still generating valid data.81 Mr. Nair had more confidence in the federal reports because of the utilities provided by the Children’s Bureau.82 This confidence, however, is misplaced. Mr. Nair seemed unaware of the fact that according to Ms. Roberts, the compliance utility is only used to identify obvious problems and she rarely uses, and does not fully understand how to use, the data quality utility.
Mr. Gelona does not perform any rigorous white box software testing on the programming code he writes for the child welfare reports. Instead, he relies on the TGU reports group and end users – none of whom have computer programming experience – to identify problems with the reports and, for the federal reports, relies on the Children’s Bureau utilities, though, as stated above, the data quality utility is infrequently used.83 Furthermore, contrary to accepted practice, Mr. Gelona concludes that his extracts from the KIDS System are accurate as long as the users of the reports that are generated do not tell him there is a problem with the report.84 He makes changes to his extracts only if asked to add a new field or if somebody tells him to make a change based on changes to the KIDS System.85 These practices are highly unsound and problematic. Given the lack of personnel with programming backgrounds in TGU and the limited scope of the Children’s Bureau utilities, problems with the data can easily go undetected.
According to Mr. Gelona, he has not written any of the queries used to create the Access reports.86 Although he prepares the data extracts used to write these queries and is co-located with the TGU reports group who write these queries, it appears that Mr. Gelona is not collaborating very closely with TGU. In particular, he testified that he never checked the Access queries until the last six months when he was asked to assist with running the YI684 queries for the quarters ending March 2007 to March 2010 (for the purposes of this litigation).87 It seems as though there was no ongoing quality control with respect to the Access queries. It is very poor practice that he, or someone else from DSD, was not asked to comprehensively check the Access queries considering that the people who write and manage these queries do not possess strong query writing or programming skills and that their quality control testing relies heavily on comparing data in a very short time period (according to Ms. Grissom’s deposition, for the reports that are produced on an
81 Nair, 125-126. 82 Nair, 104. 83 Gelona, 74-75, 131-134, 207-208. 84 Gelona, 131-133. 85 Gelona, 123. 86 Gelona, 12. 87 Gelona, 138-139, 149-150, 240.
ongoing basis they have only two hours per week to do so).88
Most of the issues raised by Mr. Gelona’s testimony are also applicable to Mr. Jew’s testimony. Mr. Jew verifies his programming code in a completely informal way by manual spot checks and he only checks his programming code if someone reports an error to him.89 Mr. Jew acknowledged that it is not always possible to check a report against a screen in the KIDS System, if, for example, the report contains aggregate information.90 For federal reports, Mr. Jew manually spot checks some of the information in the AFCARS file against the information in the KIDS System and occasionally, but “[n]ot often,” uses the Children’s Bureau utilities.91 In his view, ultimate responsibility for verifying the reports belongs to the TGU reports group.92 These are not sound software testing practices.
The face validity testing and Children’s Bureau utility (which is used only for federal
reporting) are not a substitute for rigorous quality assurance and testing practices within
the agency. These inadequate quality control procedures equally affect the Access and
WebFOCUS reports. The Access reports, however, also suffer from the fact that the queries
themselves are written by poorly trained non-programmers with little oversight from the
computer professionals in DSD. In my opinion, there is a significant risk that every child
welfare report contains inaccurate data because of these quality control issues.
7. The Child Welfare Reports Are Wrong
It is unquestionably important that the information contained in the reports used by
the child welfare workers, managers and supervisors in Oklahoma be accurate, complete
and up-to-date. Indeed, Ms. Grissom, Ms. Roberts and Mr. Gelona all rightly testified that
this was true.93 Furthermore, it is clear from testimony that the Access and WebFOCUS
reports are actually being used by child welfare workers, supervisors and managers.94
Unfortunately, DHS lacks auditing capabilities that would allow anyone to track the use
of these reports in a precise and detailed way.95
DHS recently discovered serious problems with numerous Access reports. Although
the DSD reports programmers do not routinely check or monitor the Access queries,96
during the summer of 2010, Mr. Gelona was asked to look into these queries for reasons
88 Grissom (9/7/10), 122. 89 Jew, 71, 83-84, 87. 90 Jew, 70-73. 91 Jew, 83-85, 87-88. 92 Jew, 69-76. 93 Grissom (9/7/10), 230-231; Gelona, 230-232; Roberts, 78-79, 100, 112, 114-115. 94 Grissom (9/7/10), 78-80, 87, 89-93; Roberts, 25, 28-29, 96-100, 105, 111-112. 95 Grissom (9/7/10), 65, 85, 90-91; Gelona, 71, 144-145, 152. 96 Gelona, 240.
related to this litigation.97 When he did, he found numerous problems with the YI684
Access queries and documented those problems in a report titled “Problems with YI684
queries for Litigation Discovery.”98 Mr. Gelona’s report discusses general problems that
affect all of the YI684 queries, including (1) flaws in the way counties and areas are
determined and (2) inconsistencies in area and county name labels. The report also
describes in detail problems with 60 specific queries. Mr. Gelona testified that some of
these problems will “have a huge effect on the queries” and that between 50 and 70 percent
of the YI684 queries he reviewed were affected by the problems he discovered.99 Ms.
Grissom and Mr. Nair understood that these problems likely infected all of the Access
reports and possibly the WebFOCUS reports as well.100 All of the DSD and TGU personnel
who were asked about these issues, including Ms. Grissom, Mr. Nair and Mr. Gelona,
expressed serious concerns about these problems.101
Mr. Gelona attributed these problems to the fact that “that’s what happens when you
have non-professional programmers write programs . . . [T]hey don’t fully understand all of
the data they have got or . . . all of the relationships between the data.”102 This is a serious
over-simplification of the reasons for these errors. While it is true that this is one reason
for the problems, and the WebFOCUS programming code was probably written more
skillfully because professional programmers were in charge instead of the
non-programmers at TGU, the lack of change control and quality control described above
makes errors in all of the reports highly likely.
I tried to analyze the reasons behind the specific errors identified in Mr. Gelona’s
report. Given the limited documentation of revisions made to the source code (described
above) it was impossible to fully understand the process that led to the errors (e.g.,
whether the errors were due to structural changes to the KIDS System, structural changes
to the WebFOCUS extract or other reasons). In order to fully undertake this analysis, it
would be necessary for DHS to utilize a source code version control tool that allows for
rollback to undo changes in the database (or at least retrieval of queries and data tables by
specific dates). Despite the lack of this tool, I was able to identify a number of queries that
appear to have been affected by changes to the KIDS System that are listed in the version
notes; these changes were not properly implemented in the Access queries. Contrary to
97 Grissom (9/7/10), 58-59, 149-150. 98 Deposition Exhibit 319. 99 Gelona, 151-152. 100 Grissom (9/7/10), 94-97; Grissom (8/5/10), 50, 53; Nair, 101-104. 101 Gelona, 151-153; Grissom (8/5/10), 50, 53; Grissom (9/7/10), 62, 230-231; Nair, 95-97. Ms. Frye also expressed concern about these problems in an email to Ms. Grissom (Issuesw-AccessComm-00001). 102 Gelona, 137-138.
Mr. Gelona’s assertion, this is not simply an issue of non-professional programmers; it is a
fundamental problem of a failure to implement standard database management practices.
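The capability described above, retrieval of a query as it existed on a specific date, can be sketched in a few lines. This is an illustration only: in practice an established version control tool would be used rather than custom code, and the class `QueryHistory`, the query name and the SQL text below are all hypothetical.

```python
import hashlib
from bisect import bisect_right
from datetime import date


class QueryHistory:
    """Keep every committed version of each query, retrievable by date."""

    def __init__(self):
        # query name -> list of (effective date, SQL text, note), sorted by date
        self._versions = {}

    def commit(self, name, effective, sql, note):
        """Record a new version and return a short fingerprint of its text."""
        versions = self._versions.setdefault(name, [])
        versions.append((effective, sql, note))
        versions.sort(key=lambda v: v[0])
        return hashlib.sha256(sql.encode("utf-8")).hexdigest()[:12]

    def as_of(self, name, when):
        """Return the SQL text in effect on the given date, or None."""
        versions = self._versions.get(name, [])
        idx = bisect_right([v[0] for v in versions], when)
        return versions[idx - 1][1] if idx else None


history = QueryHistory()
history.commit("example_query", date(2008, 1, 10), "SELECT 1", "initial version")
history.commit(
    "example_query", date(2009, 6, 1), "SELECT 2", "updated after a schema change"
)

print(history.as_of("example_query", date(2008, 12, 31)))  # version from 2008
print(history.as_of("example_query", date(2010, 1, 1)))    # version from 2009
```

With a record of this kind, an analyst could compare the query text in effect before and after each documented schema change and determine which change introduced a given error.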
In addition to outright errors, there is also the misuse of titles and terminology in
the child welfare reports. For example, Ms. Grissom testified that a report titled “Count of
Children in Foster Care by Area” is mislabeled because “the title does not reflect what’s in
the report.”103 Furthermore, multiple definitions were used for the same term in the
reports. One example is the term “family foster care,” as discussed by Ms. Grissom.104
Mislabeling reports and using the same term in different ways in different reports is
confusing and bad practice. Users of these reports could easily make mistakes and misuse
the reports as a result of these practices.
Fundamentally, the poor database and software management practices described
above make it likely that other Access and WebFOCUS reports beyond those identified by
Mr. Gelona contain errors. It is my opinion that the errors in these reports are numerous
and it is highly likely that many of the reports are inaccurate. In my opinion, it is harmful to
the children in DHS custody for DHS to continue to use those reports because they cannot
be relied upon to provide complete, accurate or up-to-date information. While the overall
damage to end users cannot be completely assessed because of the lack of auditing
capabilities, the widespread use of the reports makes it highly likely that child welfare
workers, supervisors and managers who are directly responsible for the welfare of
children in DHS custody are utilizing reports that are not reliable, accurate, complete or up-
to-date.
__________________________________
Zoran Obradovic, Ph.D.
March 15, 2011
103 Grissom (8/5/10), 55. 104 Grissom (8/5/10), 69.
Supplement 1
ZORAN OBRADOVIC (last update: Jan. 16, 2011)
ADDRESS: Center for Information Science and Technology,
Temple University, 303 Wachman Hall (038-24)
1805 N. Broad St., Philadelphia, PA 19122, USA
Phone (215) 204-6265, FAX: (215) 204-5082
E-mail: [email protected], WWW: http://www.ist.temple.edu/~zoran
RESEARCH INTERESTS:
Data Mining; Machine Learning; Spatial and Temporal Data Management; Bioinformatics.
TEACHING INTERESTS:
Data Mining; Bioinformatics; Machine Learning; Databases; Data Warehousing; Algorithms; Time Series Analysis; Geographical Information Systems; Pattern Recognition; Neural Networks; Intelligent Data Analysis; Artificial Intelligence; Parallel and Distributed Computing.
EDUCATION:
Ph.D. in Computer Science, The Pennsylvania State University, May 1991. Dissertation: “Discrete Multi-Valued Neural Networks.”
M.S. in Mathematics and Computer Science, University of Belgrade, June 1987. Thesis: “High-Speed Parallel Computing.”
B.S. in Applied Mathematics, Computer and Information Sciences, Univ. of Belgrade, Dec. 1985.
PROFESSIONAL EXPERIENCE:
Director 2000 - Present. Center for Information Science and Technology, Temple University, Philadelphia, PA.
Professor (tenured) 2000 - Present. Computer and Information Sciences Department, Temple University, Philadelphia, PA.
Associate Director 2003 - 2004. Center for Quantitative Biology and Biomedical Mathematics, Temple University, Philadelphia, PA.
Associate Professor (tenured) 1997 - 2000. (Assistant Professor, 1991 - 1997.) School of Electrical Engineering and Computer Science, Washington State Univ., Pullman, WA.
Guest Scientist, Fall-Winter 1998. Information and Communications Department, Corporate Research and Development, Siemens AG, Munich, Germany.
Adjunct Research Professor, 1999 - Present. (Adjunct Scientist, 1986 - 1999) The Mathematical Institute, Academy of Sciences and Arts, Belgrade, Serbia.
GRANT SUPPORT:
Funded Projects:
• Shi, J., Obradovic, Z. (Jan. 2011 – Aug. 2011) “Integrated Data Warehouse,” City of Philadelphia, Project 21100816140717, $135,064.
• Unterwald, E.M. et al. (July 2010 – June 2015) “Center on Intersystem Regulation by Drugs of Abuse,” Database and Drug Interaction Core (with Tallarida, R.), National Institute of Health, Grant 2P30DA013429, $812,301 direct cost per year.
• Wu, J., Bishwas, S.K., Bai, L., Criner, G.G., Galvinski, E.T., Klein, M.L, Kohlweyer, A, Kwatny, G., Obradovic, Z., Rivin, I., Shi, Y. (May 2010 – April 2013) “MRI-R2: Acquisition: A Hybrid High-Performance GPU/CPU System,” National Science Foundation, NSF-CNS-0958854, $839,221.
• Kelsen, S., Merali, S., Obradovic, Z. (Sept. 2009 – Aug. 2011) “Ancillary Study: Identification of Plasma Biomarkers in Chronic Obstructive Pulmonary Disease,” National Institute of Health, Grant 1RC2HL101713-01, $784,389.
• Obradovic, Z. (Oct. 2008 – Dec. 2011) “Improving Biomedical Informatics Support at Temple Health Sciences Center,” The Pennsylvania Department of Health, $300,000 (direct costs).
• Dunker, A.K., and Obradovic, Z. (June 2008 – May 2010) “Bioinformatics Linkage of Protein Disorder and Function,” National Institute of Health, Grant R56 LM007688-05A1 $441,508.
• Obradovic, Z., Vucetic, S. and Z. Li (Aug. 2006 – July 2011) “Collaborative Research: Data Mining Support for Retrieval and Analysis of Geophysical Parameters,” National Science Foundation, NSF-IIS-0612149, $600,404 ($400,207 to Temple University).
• Harris A., Obradovic, Z., Izenman, A., Mennis, J. (Sept. 2006 – Sept. 2009) “Investigating Simultaneous Effects of Individual, Program and Neighborhood Attributes on Juvenile Recidivism Using GIS and Spatial Data Mining,” National Institute of Justice, GMS Award 2006-IJ-CX-0022, $ 316,714.
• Soprano, D.R., Soprano, K.J., Obradovic, Z. and Vucetic, S. (April. 2005 – Dec. 2009) “PBX and Retinoic Acid-Dependent Differentiation,” National Institute of Health, NIH- 1 R01 DK070650-01, $1,586,250.
• Obradovic, Z. and Vucetic, S. (June 2004 – April 2008) “Applications of Bioinformatics Data Analysis to Cardiovascular and Cancer Research,” The Pennsylvania Department of Health, $250,000 (direct costs)
• Megalooikonomou, V., Obradovic, Z., Boyko, O.B., Gee, J. (January 2004 – December 2007) “Large Scale Data Analysis for Brain Images,” National Institute of Health, Grant NGA: 1 R01 MHO68066-01A1, $1,284,246.
• Dunker, A.K., and Obradovic, Z. (Sept. 2003 – Sept. 2007) “Bioinformatics Linkage of Protein Disorder and Function,” National Institute of Health, Grant R01 LM007688-01A1, $1,291,356.
• Harris A., Obradovic, Z., Izenman, A., Mennis, J. (July 2006 – Dec. 2006) “Investigating the Simultaneous Effects of Individual, Program and Neighborhood Attributes on Juvenile Recidivism Using GIS and Spatial Data Mining,” Institute of Public Affairs, Temple University, $16,320.
• Obradovic, Z. and Vucetic, S., (August 2002 - July 2006) “ITR/Small/Scientific Frontiers: Task-Specific Data Reduction and Mining in Spatial-Temporal Domains," National Science Foundation, Grant 0219736, $210,120.
• Obradovic, Z. and Vucetic, S., (June 2004 – Aug. 2004) “REU Supplement for ITR: Task-Specific Data Reduction and Mining for Spatial-Temporal Domains," National Science Foundation, $12,000.
• Kwatny, E., Stafford, R., Megalooikonomou, V. and Obradovic, Z., (Sept. 2001 - Sept. 2004) “High Performance Network Connection for Knowledge Discovery Research," National Science Foundation, Grant NSF-ANIR-0124390, $353,100 ($150,000 from NSF).
• Obradovic, Z. and Vucetic, S. (January 2004 – June 2004) “Research Infrastructure and Expertise for Gene Expression Data Analysis,” The Pennsylvania Department of Health, $70,000 (direct costs).
• Obradovic, Z., Chang, F.N., Tuszynski, G. P. and Vucetic, S. (January 2004 – June 2004) “Mining High Performance Liquid Electrophoresis Data,” Temple University, $8,000 (direct costs).
• Wolfgang, P., Obradovic, Z., Megalooikonomou, V. and Vucetic, S., (June 2003 – December 2003) “Visualization and Analysis of Commercial Flight Data,” Lockheed Martin Corp., $49,000
• Obradovic, Z. (January 2003 – August 2004) “An Efficient System for Discovering Patterns and Associations at Earth Observation Databases,” New Previously Unfunded Directions for Established Investigators Grant Application, Temple University, $30,000.
• Obradovic, Z. (March 2001 - September 2001) “Data Reduction for Spatial-Temporal Knowledge Discovery," Idaho National Engineering and Environmental Laboratory, LDRD Program under DOE contract DE-AC07-99ID13727, $50,000.
• Dunker, A.K and Obradovic, Z., (May 2000 - May 2003) “Bioinformatics, Disordered Proteins and Function," The National Institute of Health, Grant 1 R01 LM06916-01, Biotechnology, $984,026
• Obradovic, Z. and Tomsovic, K., (August 2000 - August 2004) “Towards an Understanding of Deregulated Electricity Markets through Time Series Analysis," Power Systems and Intelligent Systems Programs, Division of Engineering, National Science Foundation, Grant ECS-9988626, $240,000.
• Obradovic, Z. and Dunker, A.K., (June 1998 - December 2001) “Intelligent Data Analysis for Identifying Protein Disorder," cross-disciplinary funding by KDI Knowledge and Distributed Intelligence Initiative, Division of Information and Intelligent Systems and Division of Molecular and Cellular Biosciences, National Science Foundation, Grant IIS-9711532, $379,910.
• Obradovic, Z. and Dunker, A.K., (January 2000 - May 2001) “Supplement to Intelligent Data Analysis for Identifying Protein Disorder," Knowledge and Cognitive Systems Program, National Science Foundation, $50,858.
• Obradovic, Z. and Dunker, A.K., (January 2000 - December 2000) “REU Supplement to Intelligent Data Analysis for Identifying Protein Disorder," Knowledge and Cognitive Systems Program, National Science Foundation, $15,000.
• Obradovic, Z. (January 2000 - December 2000) “Tools for Analyzing Learned Business Valuation Models and for Construction Higher-Representation Value Driving Attributes," Valueminer.com Inc., $40,000.
• Obradovic, Z. and Fiez, T., (January 1998 - September 2000) “Integration of Distributed Heterogeneous Experts for Knowledge Discovery in Precision Agriculture,"Idaho National Engineering and Environmental Laboratory University Research Consortium, $307,832 .
• Obradovic, Z. (May 1999 - May 2000) “An Intangible Assets Analysis System for Identifying Enterprise Value Drivers," Valueminer.com Inc., $36,000.
• Obradovic, Z., (July 1993 - June 1997) “RIA: Efficient and Accurate Prediction Systems for Large Scale Problems," Knowledge and Cognitive Systems Program, National Science Foundation, $100,000.
• Obradovic, Z. and Meador, J., (July 1991- June 1993) “Parametric Fault Diagnosis in Mixed-Signal Integrated Circuits," National Science Foundation Center for Design of Analog-Digital Integrated Circuits. $90,000.
• Obradovic, Z. (Fall 1997 - Spring 1998) “Predicting Disordered Protein Structure from Amino Acid Sequence - Research Project Supervision for R. Sarac," Howard Hughes Undergraduate Research Fellowship.
• Obradovic, Z. (June 1997 - August 1997) “Intelligent Systems for Data Analysis and Modeling - - Research Project for Teachers," WSU / National Science Foundation Summer Teacher's Institute.
• Obradovic, Z. (June 1996) “Neural Network Design for Medical Applications – Research Project for Students," 1996 WSU / Howard Hughes Summer Science and Engineering Scholars Program.
• Obradovic, Z. (June 1995 - August 1995) “Analysis and Comparison of Prediction Systems for Very Noisy Domains - Research Project for Teachers," WSU / National Science Foundation Summer Teacher's Institute.
• Obradovic, Z. and Drossu, R. (June 1994) “Computer-aided Diagnosis in Medicine – Research Project for Students," 1994 WSU / Howard Hughes Summer Science and Engineering Scholars Program.
• Obradovic, Z., (July 1992 - May 1993) “Neural Networks for Large Learning Problems," Washington State University Research Grant-in-Aid Program 1992-93.
PUBLICATIONS:
I. BIOMEDICAL INFORMATICS:
Biomedical Informatics: Journal Articles
1. Potireddy, S, Midic, U., Liang, C.G., Obradovic, Z., Latham, K.E. (in press) “Positive and negative cis-regulatory elements directing postfertilization maternal mRNA translational control in mouse embryos,” Am J Physiol Cell Physiol 299.
2. Garriga, J., Xie, H., Obradovic, Z., Grana, X. (2010) “Selective Control of Gene Expression by CDK9 in Human Cells,” Journal of Cellular Physiology, vol. 222(1):200-8.
3. Midic, U., Oldfield, C.J., Dunker, A.K., Obradovic, Z., Uversky, V.N. (2009) “Unfoldomics of Human Genetic Diseases: Examples of Ordered and Intrinsically Disordered Members of the Human Diseasome,” Protein and Peptide Letters, vol. 16, no. 12, pp. 1533-1547.
4. Uversky, V.N., Oldfield, C.J., Midic, U., Xie, H., Xue, B., Vucetic, S., Iakoucheva, L.M., Obradovic, Z., Dunker, A.K., (2009) “Unfoldomics of Human Diseases: Linking Protein Intrinsic Disorder with Diseases,” BMC Genomics, vol. 10 Suppl 1:S07.
5. Midic, U., Oldfield, C.J., Dunker, A.K., Obradovic, Z., Uversky, V.N. (2009) “Protein Disorder in the Human Diseasome: Unfoldomics of Human Genetic Diseases,” BMC Genomics, vol. 10 Suppl 1:S12.
6. Li, A., Xie H., Chin, M.H., Obradovic, Z., Smith, D.J., Megalooikonomou, V. (2009) “Analysis of Multiplex Gene Expression Maps Obtained by Voxelation,” BMC Bioinformatics, 10 Suppl 4:S10.
7. Megalooikonomou, V., Kontos, D., Pokrajac, D., Lazarevic, A., Obradovic, Z. (2008) “An Adaptive Partitioning Approach for Mining Discriminant Regions in 3D Image Data,” Journal of Intelligent Information Systems, vol 31, no. 3, pp. 217-242.
8. Dunker, K., Oldfield, C.J., Meng, J., Romero, P., Yang, J., Chen, J.W., Vacic, V., Obradovic, Z. and Uversky, V.N. (2008) “The Unfoldomics Decade: An Update on Intrinsically Disordered Proteins,” BMC Genomics, vol. 9 (Suppl 2):S1, 16.
9. Xu, Q., Canutescu, A., Wang, G., Shapavalov, M.V., Obradovic, Z. and Dunbrack, R.L. (2008) “Statistical Analysis of Interfaces in Crystals of Homologous Proteins,” J. Molecular Biology, vol. 381, pp. 487-507 .
10. Ren, S., Uversky, V.N., Chen, Z., Dunker, A.K. and Obradovic, Z. (2008) “Short Linear Motifs recognized by SH2, SH3 and Ser/Thr Kinase domains are conserved in disordered protein regions,” BMC Genomics, vol. 9 (Suppl 2):S26, 9. Sept.
11. Krynetskaia, N., Xie, X., Vucetic, S., Obradovic, Z., Krynetskiy, E. (2008), “High Mobility Group Protein B1 is an Activator of Apoptotic Response to Antimetabolite Drugs,” Molecular Pharmacology, Jan;73(1):260-9.
12. Xie, H., Vucetic, S., Iakoucheva, L.M., Oldfield, C.J., Dunker, A.K., Uversky, V.N. and Obradovic, Z. (2007) “Functional Anthology of Intrinsic Disorder. I. Biological Processes and Functions of Proteins with Long Disordered Regions,” Journal of Proteome Research, May 4;6(5):1882-98.
13. Vucetic, S., Xie, H., Iakoucheva, L.M., Oldfield, C.J., Dunker, A.K., Obradovic, Z. and Uversky, V.N. (2007) “Functional Anthology of Intrinsic Disorder. II. Cellular Components, Domains, Technical Terms, Developmental Processes,” Journal of Proteome Research, May 4;6(5):1899-1916.
14. Xie, H., Vucetic, S., Iakoucheva, L.M., Oldfield, C.J., Dunker, A.K., Obradovic, Z. and Uversky, V.N. (2007) “Functional Anthology of Intrinsic Disorder. III. Ligands, Posttranslational Modifications and Diseases Associated with Intrinsically Disordered Proteins,” Journal of Proteome Research, May 4;6(5):1917-1932.
15. Radivojac, P., Iakoucheva, L.M., Oldfield, C.J., Obradovic, Z., Uversky, V.N., Dunker, A.K. (2007) “Intrinsic Disorder and Functional Proteomics,” Biophysical Journal, vol. 92, March 2007, pp. 1439-1456.
16. Midic, U. Dunker, K. and Obradovic, Z. (2007) “Exploring alternative knowledge representations for protein secondary-structure prediction,” Int’l Journal of Data Mining and Bioinformatics, 1(3):286-313.
17. Sickmeier, M., Hamilton, A., LeGall, T. Vacic, V., Cortese, M.S., Uversky, V.N., Tompa, P., Obradovic, Z. and Dunker, A.K. (2007) “DisProt: The Database of Disordered Proteins,” Nucleic Acids Research, 35(Database issue):D786-93.
18. Xu, Q., Canutescu, A., Obradovic, Z. and Dunbrack, R.L. (2006) “ProtBuD: A Database of Biological Unit Structures of Protein Families and Superfamilies,” Bioinformatics, Dec 1;22(23):2876-82.
19. Han, B., Obradovic, Z., Hu, Z.Z., Wu, C. H. and Vucetic, S. (2006) “Substring Selection for Biomedical Document Classification,” Bioinformatics, Dec 1;22(23):2876-82.
20. Romero, P., Zaidi, S., Fang, Y.Y., Uversky, V.N., Radivojac, P., Oldfield, C., Cortese M., LeGall, T., Obradovic, Z. and Dunker, A.K. (2006) “Alternative Splicing in Concert with Protein Intrinsic Disorder Enables Increased Functional Diversity in Multicellular Organisms,” The Proceedings of the National Academy of Sciences, vol. 103, no. 22, 8390-8395, May 30.
21. Peng, K., Radivojac, P., Vucetic, S., Dunker, A.K. and Obradovic, Z. (2006) “Length-Dependent Prediction of Intrinsic Protein Disorder,” BMC Bioinformatics, vol. 7 (1), 208, April 17.
22. Radivojac, P., Vucetic, S., O’Connor, T.R., Uversky, V.N., Obradovic, Z. and Dunker, A.K. (2006) “Calmodulin Signaling: Analysis and Prediction of a Disorder-Dependent Molecular Recognition,” Proteins: Structure, Function and Bioinformatics, vol. 63(2), pp. 398-410, May 1.
23. Obradovic, Z., Peng, K., Vucetic, S., Radivojac, P., and Dunker, A.K. (2005) “Exploiting Heterogeneous Sequence Properties Improves Prediction of Protein Disorder,” Proteins: Structure, Function and Bioinformatics, vol. 61, Suppl. 7, pp. 176-182.
24. Peng, K., Vucetic, S., Radivojac, P., Brown, C.J., Dunker, A.K. and Obradovic, Z. (2005) “Optimizing Long Intrinsic Disorder Predictors with Protein Evolutionary Information,” Journal of Bioinformatics and Computational Biology, vol. 3, no. 1, pp. 35-60.
25. Vucetic, S., Obradovic, Z., Vacic, V., Radivojac, P., Peng, K., Lawson, J.D., Brown, C.J., Sikes, J.G., Newton, C. and Dunker, A.K. (2005) “Disprot: A Database of Protein Disorder,” Bioinformatics, vol 21, no. 1, pp. 137-40.
26. Pokrajac, D., Megalooikonomou, V., Lazarevic, A., Kontos, D. and Obradovic, Z. (2005) “Applying Spatial Distribution Analysis Techniques to Classification of 3D Medical Images,” International Journal Artificial Intelligence in Medicine, Vol. 33, No 3, pp. 261-80.
27. Romero, P., Obradovic, Z., and Dunker, A.K. (2004) “Natively Disordered Proteins: Functions and Predictions,” Applied Bioinformatics, 3(2-3), pp. 105-13.
28. Radivojac, P., Chawla, N. V., Dunker, A.K., and Obradovic, Z. (2004) “Classification and Knowledge Discovery in Protein Databases,” Journal of Biomedical Informatics, vol. 37, pp. 224-239.
29. Iakoucheva, L.M., Radivojac, P., Brown, C.J., O’Connor, T.R., Sikes, J.G., Obradovic, Z. and Dunker, A.K. (2004) “The Importance of Intrinsic Disorder for Protein Phosphorylation,” Nucleic Acids Research, vol. 32, no. 3, pp. 1037-1049.
30. Obradovic, Z, Peng, K, Vucetic, S., Radivojac, P., Brown, C., and Dunker, A.K. (2003) “Predicting Intrinsic Disorder from Amino Acid Sequence,” Proteins: Structure, Function and Genetics, vol. 53 Suppl 6, pp. 566-72.
31. Radivojac, P., Obradovic, Z., Smith D.K., Zhu, G., Vucetic, S., Brown, C., Lawson, J.D. and Dunker, A.K., (2003) “Protein flexibility and intrinsic disorder,” Protein Science, vol. 13, pp. 71-80.
32. Vucetic, S., Brown, C., Dunker, A.K. and Obradovic, Z. (2003) “Flavors of Protein Disorder," Proteins: Structure, Function and Genetics, vol. 52, pp. 573-584.
33. Smith, D. K., Radivojac, P., Obradovic, Z., Dunker, A. K. and Zhu, G. (2003) “Improved Amino Acid Flexibility Parameters,” Protein Science, vol 12, pp. 1060-1072.
34. Iakoucheva, L.M., Brown, C.J., Lawson, J.D., Obradovic, Z. and Dunker A.K. (2002) “Intrinsic Disorder in Cell-signaling and Cancer-associated Proteins," Journal of Molecular Biology, vol. 323, pp. 573-584.
35. Dunker, A.K., Brown, C.J., Lawson, J.D., Iakoucheva, L.M. and Obradovic, Z. (2002) “Intrinsic Disorder and Protein Function," Biochemistry, May 28th, vol. 41, issue 21, pp. 6573 - 6582.
36. Dunker, A.K., Brown, C.J. and Obradovic, Z. (2002) “Identification and Functions of Usefully Disordered Proteins," Advances in Protein Chemistry, vol. 62, pp. 25-49.
37. Dunker, A.K and Obradovic, Z. (2001) “The Protein Trinity - Linking Function and Disorder," Nature Biotechnology, vol. 19, Sept., pp. 805-806.
38. Dunker A.K., Lawson J.D., Brown C.J., Romero P., Oh J., Oldfield C.J., Campen A.M., Ratlif, Hipps K.W., Ausio J., Nissen M.S., Reeves R., Kang C.H., Kissinger C.R., Bailey R.W., Griswold M.D., Chiu W., Garner E.C. and Obradovic Z. (2001) “Intrinsically Disordered Proteins," Journal of Molecular Graphics and Modeling, vol. 19, pp. 28-61.
39. Romero, P., Obradovic, Z., Li, X., Garner, E., Brown, C.J. and Dunker, A.K. (2001) “Sequence Complexity and Disordered Protein," Proteins: Structure, Function and Genetics, vol. 42, pp. 38-48.
40. Romero, P., Obradovic, Z and Dunker K. (2000) “Intelligent Data Analysis for Identifying Protein Disorder," Issues on Application of Data Mining, Artificial Intelligence Review, Vol. 14, No. 6, S2, pp. 447-484.
41. Romero, P., Obradovic, Z. and Dunker, A.K. (1999) “Folding Minimal Sequences: The Lower Bound for Sequence Complexity of Globular Proteins," FEBS Letters. vol. 462, pp.363-367.
42. Dunker, A.K., Obradovic, Z., Romero, P., Kissinger, C. and Villafranca, J.E. (1997) “On the Importance of Being Disordered," Protein Data Bank Quarterly Newsletter, Release no. 81, pp. 3-5.
Biomedical Informatics: Peer Reviewed Book Chapters
43. Xie, H., Obradovic, Z. and Vucetic, S. (2009) “Mining of Microarray, Proteomics, and Clinical Data for Improved Identification of Chronic Fatigue Syndrome,” chapter 9 in McConnell, P, Lim, S., and A.J. Cuticchia, Methods of Microarray Data Analysis VI. (Scotts Valley, California: CreateSpace Publishing, 2009), pp. 119-127.
44. Xie, H., Midic, U., Vucetic, S. and Obradovic, Z. (2008) “Algorithmic Methods for the Analysis of Gene Expression Data,” chapter 4 in Handbook of Applied Algorithms: Solving Scientific, Engineering, and Practical Problems (eds. A. Nayak and I. Stojmenovic), Wiley-IEEE Press, pp. 115-146.
45. Uversky V.N., Radivojac, P., Iakoucheva, L.M., Obradovic, Z. and Dunker, A.K. (2007) “Prediction of Intrinsic Disorder and its Use in Functional Proteomics,” chapter 5 in Methods in Molecular Biology vol. 408: Gene Function Analysis (ed. M. Ochs), Humana Press Inc., Totowa, N.J.
46. Peng, K. Obradovic, Z. and Vucetic, S. (2006) “Supervised Learning under Sample Selection Bias from Protein Structure Databases,” in Advances in Applied and Computational Mathematics, Nova Science Publishers, pp. 153-170.
Biomedical Informatics: Fully Refereed Conference Articles
47. Zhang, P., Obradovic, Z. (2010) “Unsupervised Integration of Multiple Protein Predictors,” Proc. IEEE International Conference on Bioinformatics and Biomedicine, Hong Kong.
48. Li, A., Xie, H., Obradovic, Z., Smith, D.J, Megalooikonomou, V. (2010) “Identify Gene Functions using Functional Expression Profiles obtained by Voxelation,” ACM International Conference on Bioinformatics and Computational Biology, Niagara Falls.
49. Li, A., Obradovic, Z., Smith, D.J., Bodenreider, O., Megalooikonomou, V. (2009) “Mining Association Rules among Gene Functions in Clusters of Similar Gene Expression Maps,” Proc. Workshop on Data Mining in Functional Genomics at the IEEE International Conference on Bioinformatics and Biomedicine, Washington D.C., November 2009.
50. Midic U, Dunker A.K., and Obradovic, Z. (2009) “Protein Sequence Alignment and Intrinsic Disorder: A Substitution Matrix for an Extended Alphabet,” Proc. Workshop on Statistical and Relational Learning and Mining in Bioinformatics at the 15th ACM SIGKDD Conf. Knowledge Discovery and Data Mining, Paris, France, June 2009.
51. Gao, J., Agrawal, G.K., Thelen, J.J., Obradovic, Z., Dunker, A.K., Xu, D. “A New Machine Learning Approach for Protein Phosphorylation Site Prediction in Plants,” Lecture Notes in Bioinformatics (LNBI 5462), Proc. First Int’l Conf. on Bioinformatics and Computational Biology (BICoB), April 2009, New Orleans, USA, pp. 18-29.
52. Ren, S. and Obradovic, Z. (2008) “Improvement of Survival Prediction from Gene Expression Profiles by Mining of Prior Knowledge,” Proc. IEEE Int’l Conf. on Bioinformatics and Biomedicine, Philadelphia, Nov. 2008.
53. An L., Xie H., Chin M., Obradovic Z., Smith D., Megalooikonomou V., (2008) “Analysis of Multiplex Gene Expression Maps Obtained by Voxelation,” Proc. IEEE Int’l Conf. on Bioinformatics and Biomedicine, Philadelphia, Nov. 2008.
54. Dunker, K., Oldfield, C.J., Meng, J. Romero, P., Yang, J., Obradovic, Z. and Uversky, V.N. (2007) “Intrinsically Disordered Proteins: An Update,” Proc. IEEE 7th Int’l Symp. Bioinformatics and Bioengineering, Harvard Medical School, Cambridge, MA, pp. 49-58.
55. Midic, U., Dunker, K. and Obradovic, Z. (2005) “Improving Protein Secondary-Structure Prediction by Predicting Ends of Secondary-Structure Segments,” Proc. 2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, San Diego, CA, pp. 490-497.
56. Peng, K, Vucetic, S. and Obradovic, Z. (2005) “Correcting Sampling Bias in Structural Genomics through Iterative Selection of Underrepresented Targets,” Proc. 5th SIAM Int'l Conf. on Data Mining, Newport Beach, CA, pp.621-625.
57. Xie, H., Vucetic, S., Sun, H., Hedge, P and Obradovic, Z. (2004) “Characterization of Gene Functional Expression Profiles of Plasmodium Falciparum,” Proc. 5th Conf. on Critical Assessment of Microarray Data Analysis, Durham, North Carolina.
58. Radivojac, P., Obradovic, Z., Dunker, A.K. and Vucetic, S. (2004) “Feature Selection Filters Based on Permutation Test,” Proc. 15th European Conference on Machine Learning, Pisa, Italy.
59. Peng, K., Obradovic, Z. and Vucetic, S., (2004) “Towards Efficient Learning of Neural Network Ensembles from Arbitrarily Large Datasets,” Proc. 16th European Conf. on Artificial Intelligence, Valencia, Spain, pp. 623-627.
60. Pokrajac, D., Lazarevic, A., Singleton, T. and Obradovic, Z. (2004) “Localized Neural Network Based Distributional Learning for Knowledge Discovery in Protein Databases,” Proc. Int’l Joint Conf. Neural Networks, Budapest, Hungary.
61. Peng, K., Obradovic, Z. and Vucetic, S., (2004) “Exploring Bias in the Protein Data Bank Using Contrast Classifiers,” Proc. 9th Pacific Symposium on Biocomputing, Hawaii, pp. 435-446.
62. Kontos, D., Megalooikonomou, V., Pokrajac, D., Lazarevic, A., Obradovic, Z., Ford, J., Makedon, F. and Saykin, A.J. (2004) “Extraction of Discriminative Functional MRI Activation Patterns and an Application to Alzheimer’s Disease,” Proc. 7th Int’l Conf. on Medical Image Computing and Computer-Assisted Intervention, Lecture Notes in Computer Science series, Springer, Saint-Malo, France, Lecture Notes in Computer Science 3217, Vol. 2, pp. 727-735.
63. Peng, K., Vucetic, S., Han, B., Xie H. and Obradovic, Z. (2003) “Exploiting Unlabeled Data for Improving Accuracy of Predictive Data Mining," Proc. 3rd IEEE Int’l Conf. Data Mining, Melbourne, Fl, pp. 267-274.
64. Han, B., Vucetic, S. and Obradovic, Z. (2003) “Reranking Medline Citations by Relevance to a Difficult Biological Query," Proc. IASTED Int'l Conf. Neural Networks and Computational Intelligence, Cancun, Mexico, pp. 38-43.
65. Vucetic, S., Pokrajac, D., Xie H. and Obradovic, Z. (2003) “Detection of Underrepresented Biological Sequences Using Class-Conditional Distribution Models," Proc. Third SIAM Int'l Conf. on Data Mining, San Francisco, CA, pp. 279-283.
66. Radivojac, P., Obradovic, Z., Brown, C.J. and Dunker, A.K. (2003) “Prediction of Boundaries Between Intrinsically Ordered and Disordered Protein Regions,” Proc. 8th Pacific Symposium on Biocomputing, Hawaii, pp. 216-227.
67. Radivojac, P., Obradovic, Z., Brown, C.J. and Dunker, A.K. (2002) “Improving Sequence Alignments for Intrinsically Disordered Proteins," Proc. 7th Pacific Symposium on Biocomputing, Hawaii, pp. 589-600.
68. Dunker, A.K., Brown, C.J., Lawson, J.D., Iakoucheva-Sebat, L.M., Vucetic, S. and Obradovic, Z. (2002) “The Protein Trinity: Structure/Function Relationships that Include Intrinsic Disorder,” Proc. 2002 Miami Nature Biotechnology Winter Symp., The Scientific World, 2(S2), 49-50.
69. Megalooikonomou, V., Pokrajac, D., Lazarevic, A., and Obradovic, Z. (2002) “Effective Classification of 3D Image Data using Partitioning Methods," Proc. SPIE Visualization and Data Analysis 2002 Conf., San Jose, CA, pp. 62-73.
70. Vucetic, S., Radivojac, P., Dunker, A.K., Brown, C.J. and Obradovic, Z. (2001) “Methods for Improving Protein Disorder Prediction," Proc. 2001 IEEE/INNS International Joint Conference on Neural Networks, Washington D.C., vol. 4, pp. 2718-2723. ISBN: 0-7803-7044-9
71. Williams, R.M., Obradovic, Z., Mathura, V., Braun, W., Garner, E.C., Young, J., Takayama, S., Brown, C.J. and Dunker A.K. (2001) “The Protein Non-Folding Problem: Amino Acid Determinants of Intrinsic Order and Disorder," Proc. 6th Pacific Symposium on Biocomputing, Maui, Hawaii, pp. 89-100.
72. Lazarevic, A., Pokrajac, D., Megalooikonomou, V. and Obradovic, Z. (2001) “Distinguishing Among 3-D Distributions for Brain Image Data Classification," Proc. 4th International Conference of Neural Networks and Expert Systems in Medicine and Health Care, Milos Island, Greece, pp. 389-396.
73. Pokrajac, D., Lazarevic, A., Megalooikonomou, V. and Obradovic, Z. (2001) “Classification of Brain Image Data using Measures of Distributional Distance," Human Brain Mapping, Brighton, UK.
74. Dunker, A.K., Obradovic, Z., Romero, P., Garner, E.C and Brown, C.J. (2000) “Intrinsic Protein Disorder in Complete Genomes," In S. Miyano and T. Takagi (editors) Proc. Genome Informatics 11, Tokyo, Japan, pp. 161-171.
75. Li, X., Obradovic, Z., Brown, C. J., Garner, E. C., Dunker, A. K. (2000) “Comparing Predictors of Disordered Protein," In S. Miyano and T. Takagi (editors) Proc. Genome Informatics 11, Tokyo, Japan, pp. 172-184.
76. Li, X., Rani, M., Romero, P., Obradovic, Z. and Dunker, A.K. (1999) “Predicting Protein Disordered Regions for N-, C- and Internal Regions," In S. Miyano & T. Takagi (editors) Proc. Genome Informatics 10, Tokyo, Japan, pp. 30-40.
77. Garner, E., Romero, P., C.J. Brown, Obradovic, Z. and Dunker, A.K. (1999) “Predicting Binding Regions within Disordered Proteins," In S. Miyano & T. Takagi (editors) Proc. Genome Informatics 10, Tokyo, Japan, pp. 41-50.
78. Xie, Q., Arnold, G.E., Romero, P., Obradovic, Z., Garner, E and Dunker, A.K. (1998) “The Sequence Attribute Method for Determining Relationships Between Sequence and Protein Disorder," In S. Miyano & T. Takagi (editors) Proc. Genome Informatics 1998, Tokyo, Japan, pp. 193-200.
79. Garner, E., Cannon, P., Romero, P., Obradovic, Z. and Dunker, A.K. (1998) “Predicting Disordered Regions from Amino Sequence: Common Theme Despite Differing Structural Characterization," In S. Miyano & T. Takagi (editors) Proc. Genome Informatics 1998, Tokyo, Japan, pp. 201-213.
80. Rani, M., Romero, P., Obradovic, Z. and Dunker, A.K. (1998) “Annotation of PDB with respect to Disordered Regions in Proteins," In S. Miyano & T. Takagi (editors) Proc. Genome Informatics 1998, Tokyo, Japan, pp. 240-241.
81. Romero, P., Obradovic, Z., Kissinger, C., Villafranca, J.E., Garner, E., Guilliot, S. and Dunker, A.K. (1998) “Thousands of Proteins Likely to Have Long Disordered Regions," Proc. Pacific Symposium on Biocomputing, Maui, Hawaii, vol. 3, pp. 435-446.
82. Dunker, A.K., Garner E., Guilliot S., Romero P., Albrecht K., Hart J., Obradovic Z., Kissinger C., and Villafranca, J.E., (1998) “Protein Disorder and the Evolution of Molecular Recognition: Theory, Predictions and Observations," Proc. Pacific Symposium on Biocomputing, Maui, Hawaii, vol. 3, pp. 471-482.
83. Romero, P., Obradovic, Z and Dunker A.K. (1997) “Sequence Data Analysis for Long Disordered Regions Prediction in the Calcineurin Family," In S. Miyano & T. Takagi (editors) Proc. Genome Informatics 1997, Tokyo, Japan, pp. 110-125.
84. Romero, P., Obradovic, Z., Kissinger, C., Villafranca, J.E. and Dunker, A.K. (1997) “Identifying Disordered Regions in Proteins from Amino Acid Sequence," Proc. IEEE Int. Conf. on Neural Networks, Houston, TX, vol. 1, pp. 90-95.
II. DATA MINING: Spatial and Spatio-Temporal Data Mining: Journal Articles
85. Ouzienko, V., Guo, Y., Obradovic, Z. (in review) “A Decoupled Exponential Random Graph Model for Prediction of Structure and Attributes in Temporal Social Networks.”
86. Mennis, J., Harris, P., Obradovic, Z., Izenman, A., Grunwald, H., and Lockwood, B., (in press) “The effect of neighborhood characteristics and spatial spillover on urban juvenile delinquency and recidivism,” The Professional Geographer.
87. Radosavljevic, V., Vucetic, S., Obradovic, Z. (2010) “A Data Mining Technique for Aerosol Retrieval Across Multiple Accuracy Measures,” IEEE Geoscience and Remote Sensing Letters, vol. 7, no.2, pp. 411-415.
88. Vucetic, S., Han, B., Mi, W., Li. Z., Obradovic, Z. (2008) “A Data Mining Approach for the Validation of Aerosol Retrievals,” IEEE Geoscience and Remote Sensing Letters, vol. 5, no. 1, pp. 113-117.
89. Han, B., Vucetic, S., Braverman, A. and Obradovic, Z. (2006) “A Statistical Complement to Deterministic Algorithms for Retrieving Aerosol Optical Thickness from Radiance Data,” Engineering Applications of Artificial Intelligence, vol. 19, no. 7, pp. 787-795.
90. Pokrajac, D., Obradovic, Z. (accepted with minor revisions) “Spatial-Temporal Prediction with Partial Attribute Observability," Computers and Geoscience.
91. Pokrajac, D., Hoskinson, R.L. and Obradovic, Z. (2003) “Modeling Spatial-Temporal Data with a Short Observation History," Knowledge and Information Systems. Vol. 5, pp. 368-386.
92. Pokrajac, D., Fiez, T. and Obradovic, Z. (2002) “A Data Generator for Evaluating Spatial Issues in Precision Agriculture," Precision Agriculture. Vol 3, no.3, pp. 259-282.
93. Lazarevic, A. and Obradovic, Z. (2001) “Adaptive Boosting Techniques in Heterogeneous and Spatial Databases," Intelligent Data Analysis, Vol. 5, pp.1-24.
94. Vucetic, S., Fiez, T. and Obradovic, Z. (2000) “Analyzing the Influence of Data Aggregation and Sampling Density on Spatial Estimation," Water Resources Research, Vol. 36, No. 12, pp. 3721-3731.
Spatial and Spatio-Temporal Data Mining: Peer Reviewed Book Chapters
95. Han, B., Obradovic, Z. and Vucetic, S. (2008) “Using Statistical Methods to Improve Efficiency and Accuracy of Aerosol Retrievals,” Chapter 7 in Discrete and Computational Mathematics, Nova Science Publishers, Editors: F. Liu, Gaston M. N'Guerekata, D. Pokrajac, X. Shi, J. Sun, X. Xia, pp. 93-106.
Spatial and Spatio-Temporal Data Mining: Fully Refereed Conference Articles
96. Ouzienko, V. Obradovic, Z., (in review) “Imputation of Missing Links and Attributes in Longitudinal Social Networks.”
97. Radosavljevic, V., Vucetic, S., Obradovic, Z., (in review) “Cooperative Continuous Conditional Random Fields for Structured Prediction.”
98. Lou, Q., Obradovic, Z. (in review) “Modeling Multivariate Spatio-Temporal Data with Large Gaps.”
99. Mathew, G., Obradovic, Z. (2011) “A Privacy-preserving Framework for Distributed Clinical Decision Support,” Proc. IEEE International Conference on Computational Advances in Bio and Medical Sciences, Orlando, Florida.
100. Jun, G., Ghosh, J., Radosavljevic, V., Obradovic, Z. (2010) “Predicting Ground-Based Aerosol Optical Depth with Satellite Images via Gaussian Processes,” Proc. International Conference on Knowledge Discovery and Information Retrieval, Valencia, Spain.
101. Obradovic, Z., Das, D., Radosavljevic, V., Ristovski, K., Vucetic, S. (2010) “Spatio-Temporal Characterization of Aerosols through Active Use of Data from Multiple Sensors,” Proc. International Society for Photogrammetry and Remote Sensing (ISPRS) Technical Commission VII Symposium, July 5-7, Vienna, Austria, ISPRS International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences.
102. Radosavljevic, V., Obradovic, Z., Vucetic, S. (2010) “Continuous Conditional Random Fields for Regression in Remote Sensing,” Proc. 19th European Conf. on Artificial Intelligence, August, Lisbon, Portugal.
103. Ouzienko, V., Guo, Y., Obradovic, Z. (2010) “Prediction of Attributes and Links in Temporal Social Networks,” Proc. 19th European Conf. on Artificial Intelligence, August, Lisbon, Portugal.
104. Ristovski, K., Das, D. Ouzienko, V., Guo, Y., Obradovic, Z. (2010) “Regression Learning with Multiple Noisy Oracles,” Proc. 19th European Conf. on Artificial Intelligence, August, Lisbon, Portugal.
105. Lou, Q., Obradovic, Z. (2010) “Feature Selection by Approximating the Markov Blanket in a Kernel-Induced Space,” Proc. 19th European Conf. on Artificial Intelligence, August, Lisbon, Portugal.
106. Das, D., Obradovic, Z., Vucetic, S. (2009) “Active Selection of Sensor Sites in Remote Sensing Applications,” Proc. IEEE International Conference on Data Mining, December, Miami, FL. pp. 758-763.
107. Radosavljevic, V., Vucetic, S., Obradovic, Z. (2009) “Reduction of Ground-Based Sensor Sites for Spatio-Temporal Analysis of Aerosols,” Proc. 3rd International Workshop on Knowledge Discovery from Sensor Data at the 15th ACM SIGKDD Conf. Knowledge Discovery and Data Mining, Paris, France, June 2009.
108. Ristovski, K., Vucetic, S., Obradovic, Z. (2009) “Evaluation of a Neural Networks based Approach for Aerosol Optical Depth Retrieval and Uncertainty Estimation,” Proc. Int’l Conf. on Space Technology, Thessaloniki, Greece, Aug. 2009.
109. Ayuyev, V., Jupin, J., Harris, P. and Obradovic, Z. (2009) “Dynamic Clustering Based Estimation of Missing Values in Mixed Type Data,” Proc. 11th Int’l Conf. on Data Warehousing and Knowledge Discovery, Linz, Austria, Sept. 2009, pp. 366-377.
110. Das, D., Radosavljevic, V., Vucetic, S., Obradovic, Z. (2008) “Reducing Need for Collocated Ground and Satellite based Observations in Statistical Aerosol Optical Depth Estimation,” IEEE Int’l Geoscience and Remote Sensing Symposium, July, Boston, MA.
111. Radosavljevic, V., Vucetic, S., Obradovic, Z. (2008) “Spatio-Temporal Partitioning for Improving Aerosol Prediction Accuracy,” Proc. Eighth SIAM Int’l Conf. on Data Mining, April 24-26, 2008, Atlanta, GA, USA.
112. Zhuang, W., Radosavljevic, V., Han, B., Obradovic, Z., Vucetic, S. (2008) “Aerosol Optical Depth Prediction from Satellite Observations by Multiple Instance Regression,” Proc. Eighth SIAM Int’l Conf. on Data Mining, Atlanta, GA, USA, 2008.
113. Radosavljevic, V., Vucetic, S., Obradovic, Z. (2007) “Aerosol Optical Depth Retrieval by Neural Network Ensembles with Adaptive Cost Function,” Proc. 10th Int’l Conf. Engineering Applications of Neural Networks, Thessaloniki, Greece, Aug. 2007, pp. 266-275.
114. Han, B., Obradovic, Z, Li, Z. and Vucetic, S., (2006) “Data Mining Support for Improvement of MODIS Aerosol Retrievals,” Proc. IEEE Int’l Geoscience and Remote Sensing Symp., Denver, CO, Aug. 2006.
115. Obradovic, Z, Han, B., Xu, Q., Li, Y., Braverman, A., Li, Z. and Vucetic, S. (2006) “Data Mining Support for Aerosol Retrieval and Analysis – Project Summary,” NASA Data Mining Workshop, Pasadena, CA, May 2006.
116. Qin, Y. and Obradovic, Z. (2006) “Efficient Learning from Massive Spatial-Temporal Data through Selective Support Vector Propagation,” Proc. 17th European Conf. on Artificial Intelligence, Riva Del Garda, Italy.
117. Han, B., Vucetic, S., Braverman, A. and Obradovic, Z. (2005) “Integration of Deterministic and Statistical Algorithms for Aerosol Retrieval,” Proc. International Conference on Novel Applications of Neural Networks in Engineering, Lille, France, Aug. 2005, pp. 85-92.
118. Han, B., Vucetic, S., Braverman, A. and Obradovic, Z. (2005) “Construction of an accurate geospatial predictor by fusion of global and local models,” Proc. IEEE 8th International Conference on Information Fusion, B.11.2 pp. 1-8, Philadelphia, PA, July 2005.
119. Xu, Q., Han, B., Li, Y., Braverman, A., Obradovic, Z. and Vucetic, S. (2005) “Improving aerosol retrieval performance by integrating AERONET, MISR, and MODIS data products,” Proc. IEEE 8th International Conference on Information Fusion, B.11.3 pp. 1-8, Philadelphia, PA, July 2005.
120. Pokrajac, D., Hoskinson, R., Lazarevic, A., Obradovic, Z. (2002) “Spatial-Temporal Techniques for Prediction and Compression of Soil Fertility Data," Proc. 6th International Conference on Precision Agriculture, Minneapolis, MN.
121. Hoskinson, R., Pokrajac, D., Obradovic, Z., Lazarevic, A. (2002) “The Unpredictability of Soil Fertility across Space and Time," Proc. 6th International Conference on Precision Agriculture, Minneapolis, MN.
122. Pokrajac, D. and Obradovic, Z. (2001) “Improved Spatial-Temporal Forecasting through Mining,” Proc. First SIAM Int’l Conf. on Data Mining, April 5-7, 2001, Chicago, USA.
123. Vucetic S. and Obradovic Z. (2000) “Discovering Homogeneous Regions in Spatial Data through Competition," Machine Learning: Proc. of the 17th Int'l. Conf., Stanford, CA, June 2000, pp. 1095-1102.
124. Pokrajac D. and Obradovic Z. (2000) “Combining Regressive and Auto-Regressive Models for Spatial-Temporal Prediction," Machine Learning of Spatial Knowledge Workshop at the 17th Int'l. Conf. on Machine Learning, Stanford, CA, June 2000.
125. Pokrajac, D. and Obradovic, Z. (2000) “Learning Heterogeneous Functions from Sparse and Non-Uniform Samples," Proc. IEEE-INNS-ENNS Int'l Joint Conf. on Neural Networks, Como, Italy, July 2000.
126. Pokrajac, D., Obradovic, Z. and Fiez, T. (2000) “Understanding the Influence of Noise, Sampling Density and Data Distribution on Spatial Prediction Accuracy," Track on Simulation Methodology and Control Engineering and Artificial Intelligence, R. V. Landeghem (Ed.): Proc. 14th European Simulation Multiconference - Simulation and Modeling: Enablers for a Better Quality of Life, May 23-26, 2000, Ghent, Belgium. SCS Europe 2000, ISBN 1-56555-204-0, pp. 706-708.
127. Pokrajac, D., Fiez, T. and Obradovic, Z. (2000) “A Tool for Controlled Knowledge Discovery in Spatial Domains," Track on Simulation Methodology, Tools and Standards, R. V. Landeghem (Ed.): Proc. 14th European Simulation Multiconference - Simulation and Modeling: Enablers for a Better Quality of Life, May 23-26, 2000, Ghent, Belgium. SCS Europe 2000, ISBN 1-56555-204-0, pp. 26-32.
128. Lazarevic, A. Fiez, T. and Obradovic, Z. (2000) “Adaptive Boosting for Spatial Functions with Unstable Driving Attributes," Proc. Pacific-Asia Conference on Knowledge Discovery and Data Mining, Kyoto, Japan, April 2000, Computer Science Editorial 3, Springer-Verlag, pp. 329-340.
129. Lazarevic, A. Fiez, T. and Obradovic, Z. (2000) “A Software System for Spatial Data Analysis and Modeling," Proc. Data Mining Minitrack at the IEEE Hawaii Int'l Conf. On System Sciences, IEEE Computer Society Press, January 2000.
130. Pokrajac, D., Lazarevic, A., Vucetic, S., Fiez T. and Obradovic Z. (1999) “Image Processing in Precision Agriculture," Proc. IEEE Int'l Conf. on Telecommunications in Modern Satellite, Cable and Broadcasting Services, Nis, Yugoslavia, October 1999, IEEE Press, v.2, pp. 616-619.
131. Pokrajac, D., Fiez, T., Obradovic, D., Kwek, S. and Obradovic, Z. (1999) “Distribution Comparison for Site-Specific Regression Modeling in Agriculture," Proc. IEEE/INNS Int'l Joint Conf. on Neural Networks, IEEE Press, ISBN 0-7803-5532-6, Washington D.C., July 1999, No. 346, Session 10.9.
132. Lazarevic, A., Xu, X., Fiez, T. and Obradovic, Z. (1999) “Clustering-Regression-Ordering Steps for Knowledge Discovery in Spatial Databases," Proc. IEEE/INNS Int'l Joint Conf. on Neural Networks, IEEE Press, ISBN 0-7803-5532-6, Washington D.C., July 1999, No. 345, Session 8.1B.
133. Vucetic, S., Fiez, T. and Obradovic, Z. (1999) “A Data Partitioning Scheme for Spatial Regression," Proc. IEEE/INNS Int'l Joint Conf. on Neural Networks, IEEE Press, ISBN 0-7803-5532-6, Washington D.C., July 1999, No. 348, Session 8.1A.
Spatial and Spatio-Temporal Data Mining: Invited Articles
134. Drossu, R., Fiez, T., Lazarevic, A., Pokrajac, D., Vucetic, S., and Obradovic, Z. (1998) “Use of Terrain Analysis in Yield Map Interpretation," Geographical Information Systems in Agriculture Conference, Orlando, Florida.
Predictive Data Mining Methods: Journal Articles:
135. Delibasic, B., Jovanovic, M., Vukicevic, M., Suknovic, M., Obradovic, Z. (in press) “Component-based decision trees for classification,” Intelligent Data Analysis, vol. 15 (5).
136. Suknovic, M., Delibasic, B., Jovanovic, M., Vukicevic, M., Becajski-Vujaklija, D., Obradovic, Z. (in press) “Reusable components in decision trees induction algorithms,” Computational Statistics.
137. Jones, P.R., Schwartz, D., Schwartz, I.M., Obradovic, Z., Jupin, J., (2007) “Risk Classification and Juvenile Dispositions: What is the State of the Art?” Temple Law Review, vol. 79, no. 2., pp. 461-498.
138. Vucetic, S. and Obradovic, Z. (2005) “Collaborative Filtering Using a Regression-Based Approach," Knowledge and Information Systems, Vol. 7, No. 1, pp. 1-22.
139. Pokrajac, D., Lazarevic, A. and Obradovic, Z. (2001) “Exploration-Exploitation Trade-Off in Machine Learning," Facta Universitatis, Ser. Elec. and Energ., vol. 14, no. 1, pp. 67-90.
Predictive Data Mining Methods: Peer Reviewed Book Chapters:
140. Schwartz, I.M., Jones, P.R., Schwartz, D., Obradovic, Z. (2008) “Improving Social Work Through the Use of Technology and Advanced Research Methods,” chapter 13 in Child Welfare Research: Advances for Practice and Policy (eds. Lindsey, D. and Shlonsky, A.) Oxford, pp. 214-230.
141. Obradovic, Z. and Vucetic, S. (2004) “Challenges in Scientific Data Mining: Heterogeneous, Biased, and Large Sample,” a peer reviewed book chapter at The Next Generation Data Mining (editors: H. Kargupta, A. Joshi, K. Sivakumar, Y. Yesha), AAAI/MIT Press, pp. 381-401.
Predictive Data Mining Methods: Fully Refereed Conference Articles:
142. Mathew, G. and Obradovic, Z. (2010) “Vocabularies in Collaboration Channels,” Proc. IEEE 6th Int’l Conf. on Collaborative Computing: Networking, Applications and Worksharing, Chicago, IL.
143. Song, M., Song, I.Y., Allen, R.B and Obradovic, Z. (2006) “Improving Retrieval Performance by Automatic Query Expansion with Keyphrases and POS Phrase Categorization,” Proc. 6th ACM/IEEE-CS Joint Conf. Digital Libraries, Chapel Hill, NC.
144. Radivojac, P., Sivalingam, K. and Obradovic, Z. (2003) “Learning from Class-Imbalanced Data in Wireless Sensor Networks,” Proc. IEEE Semiannual Vehicular Technology Conference Fall 2003, Orlando, Fl.
145. Vucetic, S. and Obradovic, Z. (2001) “Classification on data with biased class distribution," Proc. 12th European Conf. on Machine Learning, Freiburg, Germany, pp. 527-538.
146. Vucetic S. and Obradovic Z. (2000) “A Regression-Based Approach for Scaling-Up Personalized Recommender Systems in E-Commerce," Web Mining for E-Commerce Workshop at the Sixth ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining, Boston, MA.
147. Vucetic, S. and Obradovic, Z. (2000) “Performance Controlled Data Reduction for Knowledge Discovery in Distributed Databases," Proc. Pacific-Asia Conference on Knowledge Discovery and Data Mining, Kyoto, Japan, April 2000, Computer Science Editorial 3, Springer-Verlag, pp. 29-39.
148. Obradovic, D. and Obradovic Z. (1999) “Efficient Probability Density Balancing for Supporting Distributed Knowledge Discovery in Large Databases," Proc. IEEE/INNS Int'l Joint Conf. on Neural Networks, IEEE Press, ISBN 0-7803-5532-6, Washington D.C., No. 347, Session 8.1B.
Parallel and Distributed Data Mining: Journal Articles
149. Lazarevic, A. and Obradovic, Z. (2002) “Knowledge Discovery in Multiple Spatial Databases," Neural Computing and Applications, vol 10. no. 4, pp. 339-350.
150. Lazarevic, A. and Obradovic, Z. (2002) “Boosting Algorithms for Parallel and Distributed Learning," Distributed and Parallel Databases: An International Journal, Special Issue on Parallel and Distributed Data Mining, vol. 2, pp. 203-229.
151. Obradovic, Z. and Mehr, I., (1996) “Parallel Neural Network Learning Through Repetitive Bounded Depth Trajectory Branching," Neural, Parallel and Scientific Computations, vol. 4, no. 4, pp. 475-491.
Parallel and Distributed Data Mining: Fully Refereed Conference Articles
152. Lazarevic, A. and Obradovic, Z. (2001) “Data Reduction using Multiple Models Integration," Principles of Knowledge Discovery in Databases, Proc. 5th European Conf., Freiburg, Germany, pp. 301-313.
153. Lazarevic, A. and Obradovic, Z. (2001) “The Distributed Boosting Algorithm," Proc. 7th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining, San Francisco, CA, pp. 311-316.
154. Lazarevic, A. and Obradovic, Z. (2001) “The Effective Pruning of Neural Network Ensembles," Proc. 2001 IEEE/INNS International Joint Conference on Neural Networks, Washington D.C., pp. 796-801.
155. Lazarevic, A. and Obradovic, Z. (2001) “Boosting Localized Classifiers in Heterogeneous Databases," Proc. First SIAM Int'l Conf. on Data Mining, April 5-7, Chicago, USA.
156. Lazarevic, A., Pokrajac, D., and Obradovic, Z. (2000) “Distributed Clustering and Local Regression for Knowledge Discovery in Multiple Spatial Databases," Proc. 8th European Symposium on Artificial Neural Networks, Bruges, Belgium, April 2000, pp. 129-134.
157. Venkateswaran, R. and Obradovic, Z., (1994) “Efficient Learning through Cooperation," Proc. World Congress on Neural Networks, San Diego, CA, vol. 3, pp. 390-395.
158. Mehr, I., and Obradovic, Z., (1994) “Parallel Neural Network Learning Through Repetitive Bounded Depth Trajectory Branching," Proc. 8th IEEE Int. Parallel Processing Symposium, Cancun, Mexico, pp. 784-791.
159. Fletcher, J. and Obradovic, Z., (1993) “Parallel Constructive Neural Network Learning," Proc. 2nd IEEE Int. Symp. on High-Performance Distributed Computing, Spokane, WA, pp. 174-178.
Parallel and Distributed Data Mining: Invited and Non-refereed Articles
160. Lazarevic, A., Pokrajac, D. and Obradovic, Z. (2000) “An E-commerce System for Mining Distributed Spatial Databases," Int'l Conf. on Advances in Infrastructure for Electronic Business, Science, and Education on the Internet, L'Aquila, Italy, August 2000 (by invitation conference).
161. Mehr, I., Obradovic, Z. and Venkateshwaran, R., (1994) “Parallel and Distributed Gradient Descent Learning," Notes of the Neural Networks Workshop for the Hanford Community, Pacific Northwest Laboratory, Richland, WA, pp. 31-38.
Time Series Analysis: Journal Articles:
162. Vucetic, S., Obradovic, Z. and Tomsovic, K. (2001) “Price-Load Relationships in California's Electricity Market," IEEE Trans. on Power Systems, Vol. 16, No. 2, pp. 280-286.
163. Obradovic, Z. and Chenoweth, T., (1996) “Selection of Learning Algorithms for Trading Systems Based on Biased Estimators." Heuristics, The Journal of Intelligent Technologies, vol. 9, no. 1, pp. 9-21.
164. Chenoweth, T., Obradovic, Z. and Lee, S., (1996) “Embedding Technical Analysis into Neural Network Based Trading Systems," Applied Artificial Intelligence, Taylor & Francis, Washington D.C., vol. 10, no. 6., pp. 523-541.
165. Drossu, R. and Obradovic, Z., (1996) “Regime Signaling Techniques for Non-stationary Time Series Forecasting." Journal of Computational Intelligence in Finance, Finance & Technology Publishing, vol. 4, no. 5, pp. 7-15.
166. Drossu, R. and Obradovic, Z., (1996) “Rapid Design of Neural Networks for Time Series Prediction," IEEE Computational Science and Engineering, vol. 3, no. 2, pp. 78-89.
167. Chenoweth, T. and Obradovic, Z., (1996) “A Multi-Component Nonlinear Prediction System for the S&P 500 Index," Neurocomputing, vol. 10, no. 3, pp. 275-290.
168. Chenoweth, T. and Obradovic, Z., (1995) “An Explicit Feature Selection Strategy for Predictive Models of the S&P 500 Index," Journal of Computational Intelligence in Finance, Finance & Technology Publishing, vol. 3, no. 6, pp. 14-21.
169. Perez, L.G., Flechsig, A.J., Meador, J.L. and Obradovic, Z., (1994) “Training an Artificial Neural Network to Discriminate Between Magnetizing Inrush and Internal Faults," IEEE Trans. on Power Delivery, vol. 9, no. 1., pp. 434-441.
Time Series Analysis: Peer Reviewed Book Chapters
170. Drossu, R. and Obradovic, Z. (2000) “Data Mining Techniques for Designing Efficient Neural Network Time Series Predictors," peer reviewed book chapter no. 10 in Cloete, I. and Zurada, J. Knowledge-Based Neurocomputing, MIT Press, ISBN 0-262-03274-0, pp. 325-368.
171. Drossu, R. and Obradovic, Z. (1997) “An Analysis of the INFFC Cotton Futures Time Series: Lower Bounds and Testbed Design Recommendations," in Caldwell, B. R., (editor) Nonlinear Financial Forecasting: The First Nonlinear Financial Forecasting Competition, Finance & Technology Publishing, pp. 241-261.
172. Drossu, R. and Obradovic, Z. (1996) “Prediction Horizon Effects on Stochastic Modeling Hints for Neural Networks," In P.E. Keller, S.Hashem, L.J. Kangas, and R. T. Kouzes (editors) Applications of Neural Networks in Environment, Energy, and Health
173. Drossu, R., Lakshman, T.V., Obradovic, Z. and Raghavendra C.S., (1995) “Single and Multiple Frame Video Traffic Prediction Using Neural Network Models," In Raghavan S.V. and Jain B.N. (editors) Computer Networks, Architecture and Applications, Chapman & Hall, 1995, chapter 9, pp. 146-158.
Time Series Analysis: Fully Refereed Conference Articles
174. Vucetic, S. and Obradovic, Z. (2000) “A Constructive Competitive Regression Method for Analysis and Modeling of Non-stationary Time Series," Proc. the First Int'l Workshop on Computational Intelligence in Economics and Finance at the Fifth Int'l Conf. On Information Science, Atlantic City, N.J., USA, vol. 2, pp. 978-981.
175. Drossu, R. and Obradovic, Z., (1997) “INFFC Data Analysis: Lower Bounds and Testbed Design Recommendations," Proc. 1997 Computational Intelligence in Financial Engineering, New York, N.Y., pp. 71-74.
176. Drossu, R. and Obradovic, Z., (1997) “Regime Signaling Techniques for Non-stationary Time Series Forecasting," Proc. Chaotic and Complex Systems Minitrack at the 30-th Hawaii Int'l Conf. on System Sciences, IEEE Computer Society Press, vol. 5, pp. 530-538.
177. Drossu, R. and Obradovic, Z., (1995) “Novel Results on Stochastic Modelling Hints for Neural Network Prediction," Proc. World Congress on Neural Networks, Washington, D.C., vol. 3, pp. 230-233.
178. Drossu, R. and Obradovic, Z., (1995) “Stochastic Modeling Hints for Neural Network Prediction," Proc. World Congress on Neural Networks, Washington, D.C., vol. 2, pp. 16-19 and 88-91.
179. Drossu, R. and Obradovic, Z., (1995) “Prediction Horizon Effects on Stochastic Modeling Hints for Neural Networks," Proc. the Workshop on Environmental and Energy Applications of Neural Networks, Pacific Northwest Laboratory, Richland, WA, World Scientific Publishing.
Time Series Analysis: Lightly Refereed Conference Articles
180. Obradovic, Z. and Chenoweth, T. (1996) “Selection of Learning Algorithms for Trading Systems Based on Biased Estimators - An Abstract," Working Notes of the 1996 AAAI Workshop on Integrating Multiple Learned Models for Improving and Scaling Machine Learning Algorithms, held in conjunction with National Conference on Artificial Intelligence AAAI, Portland, OR, pp. 93-94.
181. Obradovic, Z. and Vucetic, S. (1999) “Time Series Method for Forecasting Electricity Market Pricing" in Intelligent Systems in Electricity Market Modeling session, IEEE Power Engineering Society 1999 Summer Meeting, Edmonton, Canada.
III. KNOWLEDGE SYSTEMS:
Hybrid Knowledge Systems: Journal Articles
182. Obradovic, Z. and Srikumar, R. (2001) “Parallelizing Design of Application Tailored Neural Networks," Facta Universitatis, Ser. Mathematics and Informatics, vol. 16, pp. 97-108.
183. Obradovic, Z. and Srikumar, R. (2000) “Constructive Neural Networks Design Using Genetic Optimization," Facta Universitatis, Ser. Mathematics and Informatics, vol. 15, pp. 133-146.
184. Obradovic, Z. (1997) “Guest Editorial: Hybrid Intelligence for Financial Forecasting,” NeuroVest Journal, vol. 5, no. 1, pp. 4-5.
185. Drossu, R., Obradovic, Z. and Fletcher, J. (1996) “A Flexible Graphical User Interface for Embedding Heterogeneous Neural Network Simulators," IEEE Trans. on Education, special issue on Applications of Information Technology, volume 39, no. 3, pp. 367-374.
186. Fletcher, J. and Obradovic, Z., (1995) “A Discrete Approach to Constructive Neural Network Learning," Neural, Parallel and Scientific Computations, vol. 3, no. 3, pp. 307-320.
187. Fletcher, J. and Obradovic, Z. (1993) “Combining Prior Symbolic Knowledge and Constructive Neural Networks," Connection Science: Journal of Neural Computing, Artificial Intelligence and Cognitive Research, vol. 5, nos. 3 & 4, pp. 365-375.
Hybrid Knowledge Systems: Fully Refereed Conference Articles
188. Obradovic, Z. and Chenoweth, T. (1996) “Selection of Learning Algorithms for Trading Systems Based on Biased Estimators," Proc. Adaptive Distributed Parallel Computing Conference, Dayton, OH, pp. 458-467.
189. Romero, P. and Obradovic, Z. (1995) “Comparison of Symbolic and Connectionist Approaches to Local Experts Integration," Proc. the IEEE Technical Applications Conference at Northcon 95, Portland, OR, pp. 105-110.
190. Chenoweth, T., Obradovic, Z., and Lee, S. (1995) “Technical Trading Rules as a Prior Knowledge to a Neural Networks Prediction System for the S&P 500 Index," Proc. The IEEE Technical Applications Conference at Northcon 95, Portland, OR, pp. 111-115.
191. Chenoweth, T. and Obradovic, Z., (1995) “A Multi-Component Approach to Stock Market Prediction," Proc. 3rd Int. Conf. on Artificial Intelligence on Wall Street, New York, N.Y., pp. 74-79.
192. Chenoweth, T. and Obradovic, Z., (1994) “Feature Selection for Predictive Models of the Stock Market," Proc. 2nd Int. Workshop Neural Networks in the Capital Market, Pasadena, CA.
193. Obradovic, Z. and Fletcher, J., (1993) “Integration of Knowledge-Based and Constructive Learning Neural Networks," Proc. 1993 World Congress on Neural Networks, Portland, OR, vol. 1, pp. 589-592.
194. Obradovic, Z. and Fletcher, J. (1992) “Integration of Knowledge-Based and Constructive Learning Neural Networks," Notes of the 1992 AAAI Workshop on Integrating Neural and Symbolic Processes, held in conjunction with National Conference on Artificial Intelligence AAAI, San Jose, CA.
Hybrid Knowledge Systems: Lightly Refereed Conference Articles
195. Fletcher, J. and Obradovic, Z. (1992) “Creation of Neural Networks by Hyperplane Generation from Examples Alone," Notes of the Neural Networks for Learning, Recognition, and Control Research Conference, G. A. Carpenter and S. Grossberg (eds.), the Wang Institute of Boston University.
Hybrid Knowledge Systems: Invited Articles
196. Obradovic, Z. (1997) “Guest Editorial: Hybrid Intelligence for Financial Forecasting," Journal of Computational Intelligence in Finance, vol. 5, no. 1, pp. 4-5.
197. Obradovic, Z. (1998) “Embedding Prior Knowledge to Statistical Learning Systems for Efficient Knowledge Discovery in Large Databases," Symposium on Contemporary Mathematics, Belgrade, Yugoslavia.
198. Fletcher, J. and Obradovic, Z., (1994) “Integrating a Parallel Constructive Neural Network Algorithm with an Expert System," Notes of the Neural Networks Workshop for the Hanford Community, Pacific Northwest Laboratory, Richland, WA, pp. 58-66.
Hybrid Knowledge Systems: Peer Reviewed Book Chapters
199. Romero, P., Obradovic, Z. and Fletcher J. (2000) “Integration of Heterogeneous Sources of Partial Domain Knowledge," peer reviewed book chapter no. 7 in Cloete, I. and Zurada, J. Knowledge-Based Neurocomputing, MIT Press, pp. 217-250.
Neural Networks: Journal Articles
200. Pokrajac, D., Milutinovich, J. and Obradovic, Z. (2005) “Toward Neural Network-Based Profit Optimization," Facta Universitatis, Series Economics and Organization, vol. 2, no. 3, pp. 261-275.
201. Obradovic, Z., (1996) “Computing with Nonmonotone Multivalued Neurons," Multiple Valued Logic, vol. 1, no. 4, pp. 271-294.
202. Obradovic, Z. and Parberry, I. (1994) “Learning with Discrete Multi-Valued Neurons," Journal of Computer and System Sciences, vol. 49, no. 2, pp. 375-390.
203. Obradovic, Z. and Parberry, I. (1992) “Computing with Discrete Multi-Valued Neurons," Journal of Computer and System Sciences, vol. 45, no. 3, pp. 471-492.
204. Obradovic, Z. and Yan, P., (1990) “Small Depth Polynomial Size Neural Networks," Neural Computation, vol. 2, no. 4, pp. 402-404.
Neural Networks: Fully Refereed Conference Articles
205. Jovanovic, N., Milutinovic, V. and Obradovic, Z. (2002) “Foundations of Predictive Data Mining," Proc. IEEE 6th Conf. on Neural Networks Applications in Electrical Engineering, Belgrade, Yugoslavia, pp. 53-58.
206. Pokrajac, D. and Obradovic, Z. (2001) “Neural Network-Based Method for Site-Specific Fertilization Recommendation," Proc. Society for Engineering in Agricultural, Food, and Biological Systems (ASAE) Annual International Meeting, 2001.
207. Ngom A., Obradovic, Z. and Stojmenovic, I. (1998) “Minimization of Multivalued Multithreshold Perceptrons Using Genetic Algorithms," The 28th IEEE Int'l. Symp. On Multiple-Valued Logic, Fukuoka, Japan, pp. 209-214.
208. Milenkovic, S., Obradovic, Z. and Litovski, V. (1996) “Annealing Based Dynamic Learning in Second-Order Neural Networks," Proc. IEEE Int. Conf. on Neural Networks, Washington D.C., pp. 458-463.
209. Drossu, R., Obradovic, Z. and Fletcher, J. (1996) “A Flexible Graphical User Interface to Heterogeneous Neural Network Simulators," Proc. 10th European Simulation Multiconference, Int. Society for Computer Simulation, Budapest, Hungary, pp. 273-278.
210. Venkateswaran, R., Obradovic, Z., and Raghavendra, C.V. (1996) “Cooperative Genetic Algorithm for Optimization Problems in Distributed Computer Systems," Proc. 2nd Online Workshop on Evolutionary Computation, March 11-22, 1996, pp. 49-52. Also at WWW location http://www.bioele.nuee.nagoya-u.ac.jp/wec2/papers/p015.html.
211. Drossu, R., Lakshman, T.V., Obradovic, Z. and Raghavendra C.S. (1994) “Neural Network Techniques for Video Traffic Prediction," Proc. 6th Int’l. Workshop on Packet Video, Portland, pp. D.9.1-D.9.4.
212. Fletcher, J. and Obradovic, Z. (1994) “Constructively Learning a Near-Minimal Neural Network Architecture," Proc. IEEE Int’l. Conf. on Neural Networks, Orlando, FL, pp. 204-208.
213. Obradovic, Z. and Srikumar, R. (1994) “Evolutionary Design of Application Tailored Neural Networks," Proc. IEEE Int’l. Symp. Evolutionary Computation, Orlando, FL, pp. 284-289.
214. Perez, L.G., Flechsig, A.J., Meador, J.L. and Obradovic, Z. (1993) “Training an Artificial Neural Network to Discriminate Between Magnetizing Inrush and Internal Faults," IEEE Power Engineering Society 1993 Winter Meeting.
215. Obradovic, Z. and Parberry, I. (1990) “Analog Neural Networks of Limited Precision I: Computing with Multilinear Threshold Functions," in Advances in Neural Information Processing Systems 2, ed. D.S. Touretzky, San Mateo, CA: Morgan-Kaufmann, pp. 702-709.
216. Obradovic, Z. and Parberry, I. (1990) “Learning with Discrete Multi-Valued Neurons," Machine Learning: Proc. 7th Int’l. Conf., ed. B. W. Porter and R.J. Mooney, Austin, TX, Morgan-Kaufmann, pp. 392-399.
217. Pokrajac, D. and Obradovic, Z. (2001) “Neural network-based software for precision farming," Proc. 2001 IEEE/INNS International Joint Conference on Neural Networks, Washington D.C.
218. Perez, L.G., Flechsig, A.J., Meador, J.L. and Obradovic, Z. (1993) “Training an Artificial Neural Network to Discriminate Between Magnetizing Inrush and Internal Faults," IEEE Power Engineering Society 1993 Winter Meeting.
Neural Networks: Lightly Refereed Conference Articles
219. Milenkovic, S., Litovski V. and Obradovic Z. (1996) “Nondeterminism in Artificial Neural Networks," Proc. Int’l. Memorial Conference “D.S.Mitrinovic", Nis, Yugoslavia.
220. Milenkovic, S., Litovski, V. and Obradovic, Z., (1993) “A New Adaptive Move Selection in Simulating Annealing," Proc. 15-16 Int. Annual School on Semiconductor and Hybrid Technologies, pp. 22-31, Sozopol, Bulgaria, 13-17 May, 1992-1993.
221. Meador, J. and Obradovic, Z., (1992), “A Connectionist AI Approach to Automatic Test," IEEE Pacific Test Workshop, Whistler, BC, Canada.
222. Obradovic, Z. and Srikumar, R., (1992) “Dynamic Evaluation of a Backup Hypothesis," Notes of the Neural Networks for Learning, Recognition, and Control Research Conference, G. A. Carpenter and S. Grossberg (eds.), the Wang Institute of Boston University.
223. Palmer, D., Obradovic, Z. and Allison, C. (1992) “Determining the Cause for Poor Performance of a Classification Learning System," Notes of the Neural Networks for Learning, Recognition, and Control Research Conference, G. A. Carpenter and S. Grossberg (eds.), the Wang Institute of Boston University.
Other Refereed Journal Articles:
224. Obradovic, Z., Potkonjak, M. and Obradovic, M. (1987) “Design of Efficient Algorithms for VLSI Systolic Arrays," Informatica, vol. 21, pp. 153-159.
Other Refereed Conference Articles:
225. Obradovic, Z. and Obradovic, M., (1989) “Design of a New Parallel Language and Compiler Development," Proc. 11-th Int’l Symposium Computer at the University, Cavtat, pp. 3.6.1-3.6.6.
226. Obradovic, Z. and Potkonjak, M., (1987) “Software Speed-up of VLSI Systolics with Idle Cells," Proc. 9th Int’l. Symposium Computer at the University, Cavtat, pp. 2S.01.1-2S.01.4.
227. Obradovic, Z. and Potkonjak, M., (1986) “A New Heuristic Algorithm for Solving Travelling Salesman Problem and Similar Problems," Proc. 8th Int’l. Symposium Computer at the University, Cavtat, vol. I, pp. 37.1-37.7.
228. Protic, V., Mladenovic, B. and Obradovic, Z., (1986) “An Environment for Microcomputer Development, Testing and Installation," Proc. 10-th BIH Symposium on Informatics, Jahorina, pp. 187.1-187.8.
Editorial Articles:
229. Obradovic, Z. and Liu, H. (2009) “Editorial: Special Issue on the Best of SDM’09,” Statistical Analysis and Data Mining, vol. 2, no. 5-6, 291-293.
SOFTWARE SYSTEMS:
Peer Reviewed Software:
• Drossu, R., Obradovic, Z. and Fletcher, J. (1996) “The HDE and the BP Neural Network Simulators with a TCL/TK Based Graphical User Interface," Peer reviewed. Co-sponsors: the IEEE Education Society, the IEEE Foundation, the National Science Foundation Division of Undergraduate Education, the IEEE Educational Activities Board, the IEEE Computer Society, the IEEE Technical Activities Board, the International Engineering Consortium, the American Society of Mechanical Engineers, and the IEEE Engineering Management Society. CD-ROM complementing the IEEE Trans. on Education special issue on Applications of Information Technology, M. Hagler (Ed.), IEEE Press, vol. 39, no. 3, August 1996, CD-ROM directory 18.
Commercial Software:
• Romero, P., Obradovic, Z. and Dunker, A.K. (1999) “Protein Disorder Prediction Software," Washington State University license transferred to Molecular Kinetics Inc.
• Obradovic, Z. and Lazarevic A. (1999) “A Corporate Data Analyzis System for Identify the Attributes that Drive Business Value," in use by Valueminer.com Inc.
PATENT:
• Obradovic, Z., Fiez, T., Vucetic, S., Lazarevic, A., Pokrajac, D. and Hoskinson, R. “Systems and Methods for Knowledge Discovery in Spatial Data," United States Patent No. 6865582 (issued March 8, 2005).
PROFESSIONAL ACTIVITIES:
Executive Editor:
• Statistical Analysis and Data Mining journal, Executive Editor for Applications, 2010 – Present.
Editorial Board Member:
• International Journal of Computational Intelligence in Bioinformatics and Systems Biology, 2009 – Present.
• International Journal of Computational Models and Algorithms in Medicine, 2009 – Present.
• Journal of Biomedicine and Biotechnology, 2008 – Present.
• Advances in Bioinformatics, 2008 – Present.
• Statistical Analysis and Data Mining, 2006 – 2009.
• International Journal of Parallel, Emergent and Distributed Systems, 2006 – Present.
• International Journal of Data Mining and Bioinformatics, 2005 – Present.
• Multiple Valued Logic and Soft Computing, 1995 – Present.
• IEEE Trans. on Education, 1997 – 2001.
• Journal of Computational Intelligence in Finance, 1995 – 1999.
Guest Editor:
• Statistical Analysis and Data Mining, The Best of SDM 2009 Issue (co-edited with H. Liu).
• BMC Bioinformatics, Special Issue on First International Workshop on Text Mining in Bioinformatics (co-edited with M. Song), vol. 8, supp. 9, 2007.
• Knowledge and Information Systems, Special Issue on Selected and Revised Papers from KDD-2000 Workshop on Distributed and Parallel Knowledge Discovery, vol. 3, no. 4, 2001 (co-edited with J. Ghosh, H. Kargupta and V. Kumar).
• Journal of Computational Intelligence in Finance, Special Issue on Financial News Analysis using Distributed Data Mining, vol. 7, no. 2, March 1999, (co-edited with S.H. Rubin).
• Journal of Computational Intelligence in Finance, Special Issue on Hybrid Neural Networks for Financial Forecasting, vol. 5, no. 1, January 1997.
Program Chair:
• The 4th International Workshop on Mining Multiple Information Sources, in conjunction with IEEE International Conference on Data Mining, Sydney, Australia, Dec. 2010 (Co-Chair with R. Jin, X. Zhu, H. Wang).
• Ninth SIAM International Conference on Data Mining, Atlanta, Reno, April, 2009 (Program Co-Chair with H. Liu).
• IEEE 2007 International Conference on Bioinformatics and Biomedicine, San Jose, CA, Nov. 2007 (Program Co-Chair with X.T. Hu and I. Mandoiu; and Steering committee member).
• The 39th Symposium on the Interface of Statistics, Computing Science and Applications, Philadelphia, PA, May 2007 (Co-Chair with A. Izenman).
• ACM First International Workshop on Text Mining in Bioinformatics, Arlington, MD, Nov. 2006 (Co-Chair with M. Song).
• Distributed and Parallel Knowledge Discovery Workshop, The Sixth ACM SIGKDD Int'l. Conf. on Knowledge Discovery and Data Mining, Boston, August, 2000 (Co-chair).
Track Chair:
• The 17th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, San Diego, California, Aug. 2011 (Senior Program Committee Member).
• Eleventh SIAM International Conference on Data Mining, Phoenix, Arizona, April 2011 (Senior Program Committee Member).
• Tenth SIAM International Conference on Data Mining, Columbus, Ohio, April 2010 (Senior Program Committee Member).
• Eighth SIAM International Conference on Data Mining, Atlanta, Georgia, April, 2008 (Applications Track Chair).
• The 2007 International Conference on Artificial Intelligence, Las Vegas, NV, June 2007 (Program Vice Chair).
• The 2007 International Conference on Bioinformatics and Computational Biology, Las Vegas, NV, June 2007 (Program Vice Chair).
• The 2007 International Conference on Genetic and Evolutionary Methods, Las Vegas, NV, June 2007 (Program Vice Chair).
• The 2007 International Conference on Scientific Computing, Las Vegas, NV, June 2007 (Program Vice Chair).
• IEEE 21st International Conference on Advanced Information Networking and Applications, Niagara Falls, Canada, May 2007 (Program Vice Chair for Distributed Database and Data Mining).
• Sixth SIAM International Conference on Data Mining, Bethesda, MD, April 2006 (Bio-Medical Informatics Track Chair).
Steering Committee Member:
• IEEE 2009 International Conference on Bioinformatics and Biomedicine, Washington, D.C., November, 2009.
• 2010 Conference on Intelligent Data Understanding, NASA Ames Research Center.
Program Committee Member:
• Second ICDM Workshop on Knowledge Discovery from Climate Data: Prediction, Extremes, and Impact, held in conjunction with The IEEE International Conference on Data Mining (IEEE ICDM), Sydney, Australia, December, 2010.
• The 2010 ACM Second International Workshop on Data and Text Mining in Bioinformatics, in conjunction with CIKM 2010.
• The Third Conference on Intelligent Data Understanding, San Francisco Bay area, October 5-7th, 2010.
• The 9th International Workshop on Data Mining in Bioinformatics, held in conjunction with The ACM 16th SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, Aug. 2010.
• The 2010 International Conference on Advances in Social Networks Analysis and Mining, Odense, Denmark, August, 2010.
• The 8th International Conference on Machine Learning and Application (ICMLA 2009), Miami, Florida, USA, December, 2009.
• First ICDM Workshop on Knowledge Discovery from Climate Data: Prediction, Extremes, and Impact, held in conjunction with The IEEE International Conference on Data Mining (IEEE ICDM), Miami, Florida, USA, December, 2009.
• The 15th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Paris, France, June 2009.
• The 7th International Conference on Machine Learning and Application (ICMLA 2008) San Diego, California, Dec., 2008.
• The 2008 IEEE International Conference on Bioinformatics and Biomedicine, Philadelphia, PA, Nov 7-9, 2008.
• The 2008 ACM Second International Workshop on Data and Text Mining in Bioinformatics, in conjunction with CIKM 2008, Napa Valley, California, Oct., 2008.
• 12th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Osaka, Japan, May, 2008.
• Data Mining in Medicine Workshop, in conjunction with the IEEE International Conference on Data Mining, Omaha, Nebraska, October, 2007.
• 2nd IAPR Workshop on Pattern Recognition in Bioinformatics, Singapore, October, 2007.
• 11th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Nanjing, China, May, 2007.
• 2nd BioDM Workshop on Data Mining for Biomedical Applications, Nanjing, China, May, 2007.
• Seventh SIAM International Conference on Data Mining, Minneapolis, Minnesota, April, 2007.
• 2007 IEEE Symposium Series on Computational Intelligence, Data Mining Symposium, Honolulu, Hawaii, April 2007.
• 6th International Workshop on Data Mining in Bioinformatics, 12th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining, Philadelphia, Aug. 2006.
• 2006 IAPR Workshop Pattern Recognition in Bioinformatics, Hong Kong, Aug. 2006.
• IEEE International Conference on Mechatronics and Automation, Luoyang, Henan, China, June 2006.
• The 38th Symposium on the Interface of Statistics, Computing Science and Applications, Pasadena, CA, May 2006.
• 9th Workshop on Mining Scientific Datasets, Bethesda, Maryland, April 2006.
• 5th IEEE Symposium on Bioinformatics and Bioengineering, Minneapolis, Minnesota, Oct. 2005.
• IEEE Region 8 EUROCON Int’l Conf. on Computer as a Tool, Belgrade, Serbia, Nov. 2005.
• 4th Int’l Conf. Computational Intelligence in Economics and Finance, special session on Forecasting Volatility in Financial Market, Salt Lake City, Utah, July 2005.
• 2005 IEEE Int’l Conf. Mechatronics and Automation, Niagara Falls, Canada, July 2005.
• Fifth SIAM International Conference on Data Mining, Newport Beach, CA, April 2005.
• Emerging Information Technology Conference, Princeton University, Oct. 2004.
• Fourth SIAM International Conference on Data Mining, Orlando, FL, April 2004.
• Bioinformatics Workshop at the Fourth SIAM International Conference on Data Mining, Orlando, FL, April 2004.
• 3rd Workshop on Bioinformatics in Data Mining (BIOKDD 2003), ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, August 2003.
• Third SIAM International Conference on Data Mining, San Francisco, CA, May 2003.
• Sixth Workshop on Mining Scientific Dataset, San Francisco, CA, May 2003.
• Second Workshop on Data Mining in Bioinformatics, The Eighth ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining, Edmonton, Alberta, Canada, July 2002.
• Int'l Conf. on Neural Networks Applications in Electrical Engineering, Belgrade, Serbia, September 2002.
• Soft Computing in Financial Markets Conference, Int'l Congress on Computational Intelligence Methods and Applications, Rochester Institute of Technology, N.Y., June 1999.
• Distributed and Parallel Data Mining Workshop, Knowledge Discovery in Databases Conference, New York City, N.Y., August 1998.
• 4th Int'l Conf. Neural Networks Applications in Electrical Engineering, Belgrade, Serbia, September 1997.
Executive Committee Member:
• Greater Philadelphia Bioinformatics Alliance (BioAdvance, The Children’s Hospital of Philadelphia, Drexel University, Fox Chase Cancer Center, Penn State, Temple University, Thomas Jefferson University, University of Pennsylvania, University of the Sciences in Philadelphia, The Wistar Institute), 2002 – Present.
Advisory Board Member:
• The Bioinformatics and Medical Informatics Graduate Program and its associated research center, San Diego State University, 2008 – Present.
• International Artificial Intelligence Knowledge Society, 2005 – Present.
Advising Expert:
• Bioinformatics Faculty Recruitment Committee, Faculty of Science and Technology, Uppsala University, Sweden, 2003.
Grant Proposal Review Panel Member:
• The National Science Foundation, Directorate for Computer and Information Science and Engineering, Division of Information and Intelligent Systems, 1996, 1998, 1999, 2003, 2004, 2008.
• The First Int'l Nonlinear Financial Forecasting Competition, Performance Analyst Evaluating Prediction Strategy Entries, 1996.
Keynote Lectures:
• “Spatio-Temporal Characterization of Aerosols through Active Use of Data from Multiple Sensors,” Keynote Lecture at the 3rd International Workshop on Mining Multiple Information Sources, in conjunction with IEEE International Conference on Data Mining, Miami, FL, Dec. 2009.
• “Knowledge Discovery from Biological Databases for Understanding Protein Disorder,” Keynote Lecture at the 2008 IEEE International Conference on Bioinformatics and Biomedicine, Philadelphia, PA, Nov., 2008.
• “Functions of Intrinsically Disordered Proteins and Relationship with Human Disease Network,” Keynote Lecture at 12th Serbian Mathematics Congress, Novi Sad, August, 2008.
• “Data Mining Support for Retrieval and Analysis of Geophysical Parameters,” Plenary Lecture at the 10th IASTED International Conference on Intelligent Systems and Control, Cambridge, MA, November 2007.
Other Invited Lectures:
• “Structured Regression by Continuous Conditional Random Fields and Multiple Noisy Oracles,” Serbian Academy of Sciences and Arts, Belgrade, August 2010.
• “Analysis of Temporal Social Networks and Approximation of the Markov Blanket in a kernel induced space,” Dept. of Organizational Sciences, Univ. of Belgrade, Serbia, June 2010.
• “Unfoldomics of Human Genetic Diseases,” Greater Philadelphia Bioinformatics Alliance Annual Meeting, Drexel University, PA, November 2009.
• “Unfoldomics of Human Genetic Diseases,” workshop on Translational Bioinformatics: Bridging Bioinformatics and Biomedical Informatics in Translational Medicine, Conference on Innovations in Lifesciences and Healthcare, Bryn Mawr University, PA, October, 2009.
• “Computation Enabling Information Sciences: A Data Miner’s Perspective,” panel on Computation Enabling Information Sciences, Computational Engineering and Science/HPC workshop, Lehigh University, PA, October 2009.
• “Uncertainty Estimation and Selection of Sensor Sites in Remote Sensing Applications,” IEEE SCG Section and Dept. of Electrical Engineering at University of Belgrade, September, 2009.
• “Sequence Alignment and Structural Disorder: A Substitution Matrix for an Extended Alphabet,” Serbian Academy of Sciences and Arts, Belgrade, July 2009.
• “Dynamic Clustering-Based Estimation of Missing Values in Mixed Type Data,” Dept. of Organizational Sciences, Univ. of Belgrade, Serbia, June 2009.
• “Functions of Intrinsically Disordered Proteins and Relationship with Human Disease Network,” University of Minnesota, Sept. 2008.
• “Uncertainty Reduction in Gene Expression Data Analysis,” IEEE SCG Section and Dept. of Electrical Engineering at University of Belgrade, August, 2008.
• “Using Prior Knowledge to Reduce Uncertainty when Mining Microarray Data,” IBC's Chips to Hits/Discovery to Diagnostics Conference, Philadelphia, PA, September 2007.
• “Data Mining Support for Aerosol Optical Depth Retrieval and Analysis,” IEEE Section of Serbia and Montenegro, Dept. of Electrical Engineering, Univ. of Belgrade, Serbia, Aug. 2007.
• “Data Mining Approach to Functional Characterization of Protein Disorder,” Serbian Academy of Sciences and Arts, Belgrade, Aug. 2007.
• “Data Mining Support for Retrieval and Analysis of Geophysical Parameters,” Computer Science Department, Drexel University, Feb. 2007.
• “Data Mining Approach to Functional Annotation of Protein Disorder,” College of Information Science and Technology, Drexel University, Nov. 2006.
• “Data Mining Support for Aerosol Retrieval and Analysis,” 1st Workshop on the Assessment of Global Aerosol Product, Univ. Maryland, Sept. 2006.
• “Using Gene Ontology Graphs for Biomarkers Selection from Integrated Microarray, Proteomics and Clinical Data,” International Mathematical Conference - Topics in Mathematical Analysis and Graph Theory Conference, Belgrade, Serbia, Sept. 2006 (a satellite to International Congress of Mathematicians, Madrid, Aug. 2006).
• “Integration of Deterministic and Statistical Algorithms for Aerosol Retrieval,” 38th Symposium on the Interface of Statistics, Computing Science and Applications, Pasadena, CA, May 2006.
• “A Toolbox for Characterization of Gene Functional Expression Profiles,” Keynote lecture at Indiana Bioinformatics Conference, School of Medicine, University of Indiana, Indianapolis, May 2006.
• “Earth Science Applications of Data Mining,” Mathematics and Computer Science Dept., Saint Joseph’s Univ., March 2006.
• “Integration of Deterministic and Statistical Algorithms for Aerosol Retrieval,” Serbian Academy of Sciences and Arts, Belgrade, Sept. 2005.
• “A Toolbox for Characterization of Gene Functional Expression Profiles,” IEEE Section of Serbia and Montenegro, Dept. of Electrical Engineering, Univ. of Belgrade, Serbia, Sept. 2005.
• “Characterization of Gene Functional Expression Profiles,” School of Public Health, Univ. of Medicine and Dentistry of New Jersey and American Statistical Association, New Jersey Chapter, April 2005.
• “Data Mining Approach to Study of Protein Disorder,” Center for Advanced Biotechnology and Medicine, Univ. of Medicine and Dentistry of New Jersey, April 2005.
• “Data Fusion and Models Fusion for Efficient and Accurate Aerosol Retrieval,” Jet Propulsion Laboratories and Caltech University, Pasadena, CA, March, 2005.
• “Data Mining for Efficient and Accurate Large Scale Retrieval of Geophysical Parameters,” American Geophysical Union Fall Meeting, San Francisco, CA, Dec. 2004.
• “Data Mining Approach to Study of Protein Disorder,” Emerging Information Technology Conference, Princeton University, Oct. 2004.
• “Characterization of Gene Functional Expression Profiles of Plasmodium Falciparum,” Greater Philadelphia Bioinformatics Alliance Retreat, Oct. 2004.
• “Learning from Large Data Streams,” IEEE Section of Serbia and Montenegro, Dept. of Electrical Engineering, Univ. of Belgrade, Serbia, Sept. 2004.
• “Understanding Functions of Disordered Proteins through Data Mining,” School of Medicine, University of Belgrade, June 2004.
• “Bioinformatics Approach to Study of Protein Disorder,” Georgetown University, May 2004.
• “Predicting Intrinsic Disorder from Amino Acid Sequence,” Harvard University, Aug. 2003.
• “Exploiting Unlabeled Data for Improving Accuracy of Predictive Data Mining,” Rockefeller University, June 2003.
• “Exploring Bias in the Protein Data Bank Using Contrast Classifiers,” Uppsala University, Sweden, May 2003.
• “A Distribution Based System for Discovering Interesting Knowledge in Scientific Databases,” Inauguration Workshop for the Linnaeus Centre for Bioinformatics, Uppsala University, Sweden, November 6, 2002.
• “Protein Disorder Prediction and Function Analysis," Center for Bioinformatics, University of Pennsylvania, Oct. 18, 2002.
• “Towards Solutions to Some Challenging Open Problems in Scientific Data Mining," The Mathematical Institute, Serbian Academy of Sciences and Arts, Belgrade, Serbia, Sept. 04, 2003.
• “Efficient Mining at Large Spatial Databases," IEEE Section of Yugoslavia, Dept. of Electrical Engineering, Univ. of Belgrade, Serbia, June 24, 2003.
• “Analysis of Deregulated Electricity Markets for Trading Optimizations," Plenary lecture, Balkan Power Conference 2002, Belgrade, Serbia, June 19, 2003.
• “Understanding Protein Disorder and Their Flavors," Bioinformatics and Genome Research 2002, Beyond Genome, Information and Ideas for the Post-genomic Era, San Diego, CA, June 4, 2002.
• “Controllable Data Reduction for Efficient Data Analysis of Spatial Databases," Fifth Workshop on Mining Scientific Datasets, Second SIAM Int'l Conf. on Data Mining, Arlington, VA, April 13, 2002.
• “Supervised Clustering of Disordered Proteins," Bioinformatics 2002, April 7, 2002 Bergen, Norway, April 17, 2002.
• “Data Reduction for Spatial Data Analysis,” Computer Engineering Dept., University of Southern California (January 18, 2002).
• “Knowledge Discovery in Spatial and Temporal Databases," Mathematical Challenges in Scientific Data Mining, NSF Institute for Pure and Applied Mathematics, University of California Los Angeles (January 18, 2002).
• “Commonness, Complexity, Flavors and Function of Intrinsic Protein Disorder: A Bioinformatics Study," Mathematical Challenges in Scientific Data Mining, NSF Institute for Pure and Applied Mathematics, University of California Los Angeles (January 17, 2002).
Grant Proposal Reviewer:
• The National Science Foundation, Advanced Computational Research Program.
• The National Science Foundation, Knowledge Models and Cognitive Systems Program.
• The National Science Foundation, Computational Biological Activities Program.
• The National Science Foundation, Computer and Information Science and Engineering Minority Career Advancement Awards.
• The US Department of Energy, SC-32.
• The US Army Research Office, Life Science Division.
• Science Foundation Ireland, Information and Communications Technology Directorate
UNIVERSITY SERVICE:
• University Research and Creativity Award Committee member (2010, 2011)
• The Sciences Subcommittee member of the University Graduate Board (2010)
• President’s Tenure and Promotion Advisory Committee member (2004 – 2005).
• CST Dean Search Committee member (2004-2005).
• CST College Promotion and Tenure Committee member (2003-2011).
• CST Merit Committee member (2001, 2003, 2006, 2007, 2008, 2010).
• CIS Executive Committee (2009, 2010)
• CIS Chair of Faculty Search Committee (2001, 2002, 2007, 2008).
• CIS Faculty Search Committee member (2009).
• CIS Chair of Promotion and Tenure Committee (2006, 2008, 2009, 2010)
• CIS Promotion and Tenure Committee member (2004).
• CIS Advisor to Faculty Search Committee (2003).
• CIS Chair of Research Committee (2005 – 2007).
• CIS Research Committee member (2000-2005).
• CIS Graduate Studies Committee member (2000-present).
• EECS Graduate Program Coordinator for Computer Science (1997-2000).
• EECS Undergraduate Program Coordinator for Computer Science (1992-1994).
• EECS Personnel and Policy Committee elected member (1997-present).
• EECS Task Force on Graduate Studies Policies member (1996-present).
• CS Faculty Search committee chair (1999).
• EECS Director Search committee member (1999).
• CS Promotion and Tenure Committee chair (1999).
• CS Third Year Evaluation Committee member (1999).
• CS Faculty Search Committee member (1997).
• 20 PhD and more than 30 M.S. thesis committees member (1991-present).
• Interdisciplinary graduate students advising and committee service in Economics, Management and Systems, Biochemistry, Chemistry, Crop and Soil Sciences, Psychology, Education, Electrical Engineering and Computer Science (1991-present).
• CS colloquium coordination (1991-1996).
• CS Graduate Students Admission committee member (1994-1996).
• Software Engineering technical committee member (1994-1996).
• Computer Engineering Curriculum Development committee member (1991-1994).
• Software Engineering Curriculum Development committee member (1991-1993).
• Algorithmics Curriculum Development committee member (1991-1994).
• Artificial Intelligence Curriculum Development committee member (1991-1994).
• Undergraduate Studies committee member (1991-1994).
• Boeing Chairman Professorship Faculty Search committee member (1991).
TEACHING: Consistently receiving excellent student evaluations (significantly better than for department, college and university at all criteria)
• Machine Learning (graduate course, taught in 2005, 2006)
• Knowledge Discovery and Data Mining (graduate course, taught in 1998, 2003, 2004, 2008, 2010).
• Data Warehousing, Filtering and Data Mining (graduate course, taught in 2001).
• Neural Computation (graduate course, developed and taught in 1992, 1993, 1994, 1996 and 1999, 2001, 2006, 2009, 2011).
• Parallel Computation (new graduate course, developed and taught in 1992, 1993 and 1996).
• Artificial Intelligence (graduate course, taught in 1992 and 1993).
• Algorithmics (graduate course, developed and taught in 1995, 1996, 1997, 1998).
• Design and Analysis of Algorithms (undergraduate course, taught in 1994).
• Automata and Formal Languages (undergraduate course, taught in 1994 and 1997).
• Introduction to Artificial Intelligence (undergraduate course, taught in 1991 and 1995).
• Neural Network Design and Application (new undergraduate course, developed and taught in 1997).
STUDENTS ADVISING:
Postdoctoral Associates and Visiting Scholars: Research: Bioinformatics
• Xiaohong Li (1999-2001)
• Slobodan Vucetic (2001-2002)
• Junping Wang (1999-2001)
• Lining Yu (2006-present)
• Vadim Ayuyev (2007-2008)
• Zhongmei Shu (2008-present)
• Dairong Wang (2009-present)
Current Ph.D. Students:
• Debashis Das - Research: Spatial and Temporal Data Mining
• Mohamed Ghalwash - Research: Bioinformatics
• Solomon Jones - Research: Health Informatics
• Joseph Jupin - Research: Data Fusion
• Qiang Lou - Research: Spatial and Temporal Data Mining
• George Mathew - Research: Health Informatics
• Uros Midic - Research: Bioinformatics
• Zhang Ping - Research: Bioinformatics
• Yilian Qin - Research: Spatial and Temporal Data Mining
• Vladimir Ouzienko - Research: Analysis of Social Science Data
• Dusan Ramljak - Research: Bioinformatics
• Vladan Radosavljevic - Research: Spatial and Temporal Data Mining
• Kosta Ristovski - Research: Spatial and Temporal Data Mining
• Alexey Uversky - Research: Medical Informatics
Graduated Ph.D. Students:
• Qifang Xu
- Dissertation: “Statistical Analysis of Biological Interactions of Homologous Proteins,” Computer and Information Science Ph.D., Temple University, Fall 2008.
- First Ph.D. position: Research Associate, Fox Chase Cancer Institute, Philadelphia
• Michael Hongbo Xie
- Dissertation: “Functional Characterization of Large Scale Biological Data,” Computer and Information Science Ph.D., Temple University, Summer 2007.
- First Ph.D. position: Bioinformatics Specialist, IDF Research Department of Children's Hospital of Philadelphia
• Bo Han
- Dissertation: “Knowledge Discovery by Fusion of Information,” Computer and Information Science Ph.D., Temple University, Summer 2007.
- First Ph.D. position:
• Kang Peng
- Dissertation: “Learning from Protein Structure Related Data,” Computer and Information Science Ph.D., Temple University, Spring 2006.
- First Ph.D. position: Research Associate, School of Informatics, Indiana University, Bloomington.
• Predrag Radivojac
- Dissertation: “Classification and Knowledge Discovery in Protein Databases,” Computer and Information Science Ph.D., Temple University, Fall 2003.
- First Ph.D. position: Research Associate, School of Medicine, Indiana University, Indianapolis.
• Dragoljub Pokrajac
- Dissertation: “Knowledge Discovery in Spatial-Temporal Databases,” Computer and Information Science Ph.D., Temple University, Summer 2002.
- First Ph.D. position: Assistant Professor, Computer Science Dept., Delaware State University.
• Aleksandar Lazarevic
- Dissertation: “Distributed Inductive Learning for Time/Space Data Analysis,” Computer and Information Science Ph.D., Temple University, Fall 2001.
- First Ph.D. position: Research Associate, Army High Performance Computing Research Center, Computer Science Dept., University of Minnesota.
• Slobodan Vucetic
- Dissertation: “On-line Systems for Non-stationary Data Analysis and Modeling,” Electrical Engineering Ph.D., WSU, Summer 2001.
- First Ph.D. position: Visiting Assistant Professor, Center for Information Sciences and Technology and Computer and Information Sciences Department, Temple University, Philadelphia, PA.
• Pedro Romero
- Dissertation: “Knowledge Discovery and Data Mining in Protein Databases,” Computer Science Ph.D., WSU, Spring 1999.
- First Ph.D. position: Research Scientist, Artificial Intelligence Laboratories, Stanford Research Institute International, Menlo Park, CA.
• Radu Drossu
- Dissertation: “Efficient Design of Neural Networks for Time Series Prediction,” Computer Science Ph.D., WSU, Summer 1997.
- First Ph.D. position: Staff Scientist, Financial Engineering Group, HNC Software Inc., San Diego, CA.
• Tim Chenoweth
- Dissertation: “A Neural Network Based System for Predicting Future Returns for the S&P 500 Stock Index,” Interdisciplinary Ph.D., WSU, Summer 1996.
- First Ph.D. position: Assistant Professor (tenure track), School of Accountancy, Arizona State University.
• Srdjan Milenkovic
- Dissertation: “Higher-Order Dynamic Learning Through Nondeterministic Global Optimization,” Electrical Eng. Ph.D., University of Nis, Yugoslavia, co-advised with Prof. Vanco Litovski from Univ. of Nis, Summer 1996.
- First Ph.D. position: Research Scientist, Microelectronics Centre, Middlesex University, London, United Kingdom.
• Justin Fletcher
- Dissertation: “A Constructive Approach to Hybrid Architectures for Machine Learning,” Computer Science Ph.D., WSU, Summer 1994.
- First Ph.D. position: Principal Engineer, Itron Corp., Spokane, WA.
Graduated M.S. Students:
• Stephen Muchmore
- Thesis: “Combining Point Level and Aggregated Tract Level Data to Improve Clustering of Adolescent Crime Data in Philadelphia,” Computer Science, M.S., Fall 2007.
• Yilian Qin
- Thesis: “Support Vector Machine Reuse for Large Spatio-Temporal Datasets,” Computer Science M.S., Spring 2005.
- Continued towards a computer science Ph.D. at Temple University
• Tim Chenoweth
- Project: “Learning Algorithms for Trading Systems Based on Biased Estimators,” Computer Science M.S., Spring 1996.
- Continued towards an Interdisciplinary Ph.D. at WSU.
• Srikumar Rangarajan
- Thesis: “Design of Application-Tailored Neural Networks Using Genetic Algorithms,” Computer Science M.S., Summer 1993.
- First M.S. job at the Microsoft Inc., Redmond, WA.
• Anthony Kampka
- Thesis: “A Stochastic Technique in Constructive Training of Artificial Neural Networks,” Computer Science M.S., Fall 1992.
- First M.S. job at the Exabyte Corp., Boulder, CO.
• Shailesh Vaishnavi
- Project: “Storage Organization for Multiattribute Retrieval in CAD Databases,” Computer Science M.S., Summer 1992.
- First M.S. job at the Amdahl Corp., Sunnyvale, CA.
Undergraduate Students Advising:
• Bobby Parchuri - Penn State U. student trained in bioinformatics research in my laboratory (2006)
• Mathew Fenty - Trained in bioinformatics research through a hands-on project in my laboratory (2005)
• Josh Crean and Josh Hartwel - Research assistants on my spatial-temporal data analysis project (2004).
• Timothy O'Connor
- Washington State U. student trained on my bioinformatics project with A.K. Dunker (2001-2003).
- Visiting scholar at my lab at Temple U. (Goldwater fellowship, 2003).
- Accepted to graduate schools at Harvard, Univ. Washington and Princeton. Started Ph.D. studies at Princeton U., Fall 2005.
• Ethan Garner
- Research assistant on my project with A.K. Dunker (1997-2000).
- Published 4 joint papers on our bioinformatics project.
- Accepted to graduate school at Harvard, Stanford, Scripps, Wisconsin, UC Berkeley and UC San Francisco. Starting at UC San Francisco Fall 1999.
- Elected to remain at our lab as a technician until Fall 2000 in order to publish several more papers.
• Radmila Sarac - Awarded a Howard Hughes fellowship to work on my bioinformatics research project (1997-1998).
• James Jungbauer - Trained and partially funded in my lab (1994).
• Chris Allison and David Palmer
- Research assistants on my neural networks project (1992-1993).
- Published a joint conference paper with me.
• Certified Undergraduate CS Program Advising (1992 -1994) (Advising all certified CS students, on average 53 students per year).
• Uncertified CS Students Academic Advising (1991 - 1997) (Advising on average 15 undergraduate students per year).
Other Research Service:
• Mentored twelve high school students on research projects in my lab (1994, 1996).
• Mentored four high school teachers on summer research projects in my lab (1995, 1997).
• Served as External evaluator/committee member for several Ph.D. theses in Asia and Europe (Univ. of Ottawa, National Univ. of Singapore, Univ. of Belgrade, Univ. of Novi Sad, Univ. of Nis).
AWARDS/RECOGNITION:
• H-index 39 according to Harzing's Publish or Perish (as of Aug. 2010)
• Cited more than 6,200 times according to Harzing's Publish or Perish (as of Aug. 2010)
• Author of the 3rd most cited article of all time across all volumes published by the Biochemistry journal (as of Aug. 2010). The top 20 list is available as "Most Cited Papers, All Time" at the Biochemistry web site (with more details at CrossRef's Linking service). The Biochemistry journal has an impact factor above 5 and has existed for more than 100 years; it currently prints 8 volumes per year with 24 issues per volume.
• Temple University Faculty Research Award, April 2009.
• College of Science and Technology Faculty Research Excellence Award, Nov. 2008.
• Team leader for the best rated model of intrinsically disordered protein regions at the seventh critical assessments of structure prediction experiments (CASP 7), Nov. 2006.
• Team leader for the best predictor in protein disorder category at the sixth critical assessments of structure prediction experiments (CASP 6), Nov. 2004.
• Team leader for the best predictor in protein disorder category at the fifth critical assessments of structure prediction experiments (CASP 5), Nov. 2002.
• Researcher of the Year, College of Engineering and Architecture, Annual Convocation, Washington State University, April 2000.
PERSONAL: USA citizen.
Supplement 2
Resume/Bio Sketch
David R. Schwartz, MSW
CEO/President, Q-linx, Inc.
EDUCATION/TRAINING
The University of Michigan, Ann Arbor, BA, 1994, Kinesiology
The University of Pennsylvania, MSW, 1997, Social Work Macro Practice
Summary
Facilitates meetings with IBM sales teams, marketing teams, technical teams and their customers in the U.S. and globally for technology training, database development/analysis consulting, and sales purposes. Works collaboratively with IBM teams before, during, and after sales meetings, requiring frequent public speaking, presentations, meeting facilitation, executive level customer interaction, training needs assessments, and training.
Recognized as the first to train and test neural network technology (a powerful data mining and pattern recognition technique) utilizing a nationally representative dataset with the purpose of augmenting decision-making and training in child welfare/human services organizations. With extensive multi-disciplinary collaboration, the novel technology (now sold by IBM to government agencies globally) has reached approximately 90% accuracy.
Dedicated to developing and supporting creative evidence-based technology solutions.
Extensive experience initiating and managing multi-disciplinary human services technology development projects collaborating with professors from the University of Pennsylvania’s School of Engineering and Applied Sciences and School of Social Policy and Practice, several Temple University schools and colleges, the University of Michigan, and professional sales and consulting teams at IBM.
Founder and CEO/President 1998-Present Q-linx, Inc.
Responsible for management of risk assessment/decision-support technology development and consulting company to augment training and decision-making in the fields of education, health care, criminal justice, and social welfare.
Created risk assessment technology using computational intelligence techniques to aid worker decision-making and augment training.
Leads and manages several multi-disciplinary projects requiring sales/training of customers in addition to managing teams of computer science, engineering, and social science experts. For example, New York’s Office of Children and Family Services data mining project with IBM/Q-linx.
Developed Q-linx, Inc.’s global partnership with IBM Global Social Services (1998-Present).
Trains IBM sales teams in the U.S. and internationally on Q-linx risk assessment technology for sales support purposes.
Presents risk assessment technology alongside IBM global and local sales teams to customers in the U.S., Canada, Australia, Japan, Israel, etc., facilitating question and answer sessions with their customers.
Oversees the development of grants and innovative technology-focused services in the child welfare, education, and healthcare fields. For example, presented innovative neural network training/risk assessment portion of a large U.S. Department of Education training grant at the national transition to teaching conference.
Consults on risk assessment/training-focused software solutions for for-profit, nonprofit and government organizations nationally and internationally.
Director of Development 1997-1998 Child, Inc., Wilmington, DE
Overall responsibility for planning and development activities for a comprehensive child and family services agency.
Co-wrote a winning proposal for Violence Against Women Act funding. A shelter was constructed for victims of domestic violence, with on-site support, therapeutic, and training services focused on the prevention of future domestic violence.
Case Manager 1994-1995 The Choice Program, Baltimore, MD
Part of a team managing after school training programs for at-risk middle school students.
Worked in Prince George’s County, Maryland office in a school-based juvenile justice prevention and alternative to secure detention program.
Violent and nonviolent youth offenders were admitted to the program. Made daily school and home visits, facilitated group sessions with clients, and advocated for the youth in the community.
Awards
Wharton Business School Journal Award, QLINX.com, Most Socially Responsible Business Plan, 2000.
Research Publications
Schwartz, D. R., Kaufman, A. B., & Schwartz, I. M. (2004). Computational intelligence techniques for risk assessment and decision support. Children and Youth Services Review, 26, 1081-1095
Jones, P. R., Schwartz, D. R., Schwartz, I. M., Obradovic, Z., & Jupin, J. (2006). Risk classification and juvenile dispositions: What is the state of the art? Temple Law Review, 79, 2, 461-498
Schwartz, I. M., Jones, P. R., & Schwartz, D. R. Improving Social Work Through the Use of Technology and Research, Child Welfare Research, edited by Duncan Lindsey and Aron Shlonsky. (2008). Oxford University Press: New York.
Selected Presentations
Numerous presentations and papers on augmenting training, risk assessment, and decision-making with technology have been delivered:
Facilitated numerous IBM master class presentations, sales team briefings/training, and customer education/training/program presentations at several APHSA IT Solutions Management for Human Services Conferences.
Risk Assessment Models and Empirical Validity: Making Life and Death Decisions, Panel Discussion, Conference Faculty/Panel Presenter. One Child Many Hands Conference, a Multi-Disciplinary Conference on Child Welfare, University of Pennsylvania. (2009).
Risk Assessment in Juvenile Justice: Identifying Best Practice. Upcoming presentation (2/24/2010) at the annual meeting of the American Society of Criminology (with Peter R. Jones and Ira M. Schwartz).
Risk Classification in Juvenile Justice and Child Welfare: The Dangers of Overconfidence. Paper presented at the annual meeting of the American Society of Criminology (with Peter R. Jones and Ira M. Schwartz). (2008).
The International Society for the Prevention of Child Abuse and Neglect, as accepted presenter and invited session chair/facilitator (Berlin, Germany, Warsaw, Poland, Lisbon, Portugal).
Temple University Medical School, Grand Rounds. National Symposium on Child Sexual Abuse.
Competencies
Data mining with large and small databases, data warehouse development/restructuring, artificial neural networks, fuzzy logic, pattern recognition/risk assessment with large social, health, human services, and criminal datasets, client meeting facilitation, consulting with executive clients, client sales presentations, client/sales team training, frontline worker MIS training and needs assessment, presentation development, project management, E-discovery technologies, multi-disciplinary team management, IT and business expert collaboration.
Contact Information
David R. Schwartz 128 Union Avenue Bala Cynwyd, PA 19004
[email protected] Phone: 610.733.7140
Supplement 3
Type of Document
Name of Document Bates Range, if applicable
Data, mdb files, and Databases
684v2000.mdb
n/a
Database.mdb
n/a
Resource.mdb
n/a
Staff.mdb
n/a
YI678 Database (PR800I.dbf; PRdata.dbf; PRhoh.dbf; RscChar.dbf; RscHHMmd.dbf; Rscrdata.dbf; rsrcbkck.dbf; rscrtrng.dbf)
n/a
YI684 Database (monitor.accdb; monitor.dbf; YI602.dbf; yi684bl2.dbf; YI684CL.dbf; YI684DL.dbf; YI684EL.dbf; yi684pl.dbf; YI701MBL.dbf)
n/a
YI701 Database (monitor.dbf; YI701MBL.dbf; yi701mcl.dbf; yi701mdl.dbf; yi701mel.dbf)
n/a
YI684 Data from 3/30/2010 Data Run
n/a
WebFOCUS Source Code for Access
CodeforAccess&KAHDS-00001-381
Gelona’s List of Resource Type Short Names
MGrissom-Gelona List-00001-00002
How To Link Access Data Tables
MGrissom-How to Link-00001-00003
YI684 Data 3.30.10 (Labels.htm; monitor.dbf; YI602.dbf; yi684bl2.dbf; YI684CL.dbf; YI684DL.dbf; YI684EL.dbf; YI684L.dbf; yi684pl.dbf; YI701MBL.dbf; yi701mcl.dbf; yi701mdl.dbf; yi701mel.dbf)
n/a
Query Lists
List of Non-Access Database Queries
Data C&R-4-00001-13
List of YI678 Queries
YI678-00001-4
List of YI684 Queries
YI684-00001-7
YI678 Queries with Descriptions
Data C&R-9-00001-15
YI684 Queries with Descriptions
Data C&R-11-00001-44
Current List of Non-Access Queries (incomplete)
KIDSRptList-11.1.10-00001-20
Current List of Non-Access Queries (complete)
KIDSRptList-11.1.10-00001-00044
List of YI701 Queries
Listof701Queries-00001-6
Third List of Queries – Data C&R-4-00001-13 Contained in Access Databases
MGrissom-List of Queries-00001-00003
Emails with attachments
Emails re: Access database use
WhiteA-004002, WhiteA-004003-4007, WhiteA-016446, WhiteA-016447
Emails re: issues with KIDS reports
Issuesw-AccessComm-00001-00096
Emails re: KIDS reports task force
Survey-Taskforce-00001-00005
Deposition Transcripts and Exhibits
Deposition transcript and exhibits of Mary Grissom, 10/1/2008
n/a
Deposition transcript and exhibits of Mary Grissom, 8/5/2010
n/a
Deposition transcript and exhibits of Mary Grissom, 9/7/2010
n/a
Deposition transcript and exhibits of John Gelona, 9/23/2010
n/a
Deposition transcript and exhibits of Jin Jew, 11/9/2010
n/a
Deposition transcript and exhibits of Nancy Elizabeth Roberts, 11/9/2010
n/a
Deposition transcript and exhibits of J.G. Nair, 12/1/2010
n/a
Standalone documents
Document entitled "Problems with YI 684 Queries in Litigation Discovery"
n/a
Children’s Safety Initiative: Oklahoma’s CW Practice Model Implementation and Training Plan
CWPMSC-4.2010-00001-13
Resume of Jayaprakash Nair
Resumes-00003-7
Instructions for seeing the SQL source code generated by the Access GUI previously supplied
AccessDBQueries-SQLInst-00001
Meeting minutes from CFSD Administrative Staff Meeting, 8/9/2010
LimitQueriesComm-00001-00002
KIDS Version Notes
Data C&R-3-00001-450
KIDS Screen Fields to Data Elements
Data C&R-1-00001-72
Foster Care AFCARS Elements
Data C&R-2-00001-29
KIDS Application Guide
n/a
KIDS Picklist Values
KIDS Picklist Values-00001-267
Current Version of the KIDS Reports Page
KIDSRptPage-11.1.10-00001
Web FOCUS Headers
WebFOCUSHeaders-00001-30
YI678 Data File Header Description
Data C&R-12-00001-10
YI684 Data File Header Description
Data C&R-10-00001-15
OKDHS Data Dictionary
OKDHS Data Dictionary KIDS-00001-02662
KIDS User Manual
KIDS Manual-00001-00391
Supplement 4
Errors found in the 2009 and 2004 KIDS Version Notes
The following are specific examples of errors that were found in the 2009 and 2004 KIDS version notes:
2009 KIDS Version Notes
• Less than optimal Child Death Screen functionality (Data C&R-3-00438-39).
• Error in Case Review functionality, error message when making multiple changes (Data C&R-3-00446-47).
• Disappearing Foster Care Claims, documentation issues/errors (Data C&R-3-00447).
• Error in Financial Management – February Claims, not handling dates appropriately (Data C&R-3-00447).
• Correction needed in Client Information functionality for Case Connect Rollbacks (Data C&R-3-00447).
• "Difficulty" with Investigation Close – Out of Home functionality (Data C&R-3-00447).
• Error in Private/Tribal functionality (Data C&R-3-00448).
• Private/Tribal Adoption Case error, date acceptance problem (Data C&R-3-00448).
• Error in the AWOL Warrant Information function, missing information after saving it (Data C&R-3-00448).
• Error in Reports-Resource Contacts function, time frame issues/errors (Data C&R-3-00449).
2004 KIDS Version Notes
• Errors are generated when documenting the Investigation Interview date, populating at the year 1900 if certain information is added (Data C&R-3-00017).
• Errors are generated when working with the Individual Service Plans, copying/populating over wrong dates (Data C&R-3-00017).
• Database errors in the Parental Rights Fast Add function (Data C&R-3-00018).
• Error in the Pre-Resource Report (Data C&R-3-00018).
• Error in CWS-KIDS-1 Referral Information Report; not printing out injury specifics properly (Data C&R-3-00018).
• Error in the CWS-KIDS-25 Progress Reports, not populating correct fields when printing Progress Reports (Data C&R-3-00019).
• Error in the Case Review Manual Assignment function (Data C&R-3-00020).
• Error in the On Call-Organization function, did not allow for times; only dates could be added (Data C&R-3-00034).
• Error because workers can select bad labels; Out of State or Tribal Jurisdiction not valid (Data C&R-3-00035).
• Error in the OCS Referral Screen; not populating names correctly (Data C&R-3-00036).
• Error in a critical Referral Information Report, CWS-KIDS-1; not numbering sections correctly when user would print (Data C&R-3-00036).
• Error in the Visitation Episode screen (Data C&R-3-00036).
• Error in the Investigation Assessment/Investigation Close Date function (Data C&R-3-00036-37).
• Error in the Adoption Zip Code field (Data C&R-3-00046).
• Error in the Court Hearing function; populating wrong information (Data C&R-3-00046).
• Bad data populating in Reports (Data C&R-3-00047).
• Error in the Mental Health Commitment functionality, unable to record information properly (Data C&R-3-00047).
• Errors in AFCARS goals functionality; data not pulling to reports (Data C&R-3-00048).
• Error in Child's Needs Assessment, past assessments not read-only, workers could manipulate past assessments (Data C&R-3-00048).
• Errors (“item does not pass validation test”) are generated when documenting important information about children's Treatment Plans (Data C&R-3-00050-51).