Report Intern Final

  • 7/31/2019 Report Intern Final

    Chapter 1

    Introduction

    1.1 Organization Background

    WHO-IPD (Programme for Immunization Preventable Diseases), formerly known as

    Polio Eradication Nepal (PEN), provides technical support to the Ministry of Health

    and Population (MoHP) for vaccine preventable diseases (VPDs) in Nepal. Since its

    establishment in 1998, IPD has been supporting the Government of Nepal's endeavor

    to strengthen the surveillance of acute flaccid paralysis (AFP for polio), measles,

    neonatal tetanus, acute encephalitis syndrome (AES for Japanese encephalitis), and

    the routine immunization programme. IPD supports these activities in close

    collaboration with the Child Health Division, Epidemiology and Diseases Control

    Division, and National Public Health Laboratory under the Department of Health

    Services of the Ministry of Health and Population. IPD currently has 11 field offices with

    15 surveillance medical officers. All of the IPD field offices operate in close

    coordination with the Regional Health Directorates and the District (Public) Health

    Offices to carry out the surveillance and immunization-related activities.

    1.1.1 IPD's Core Activities

    1. Surveillance of VPDs

    2. Surveillance support for other infectious diseases

    3. Support for routine and supplementary immunization activities

    4. Support in policy formulation and strategy development for the National

    Immunization Programme (NIP)

    5. Research, publication and dissemination of surveillance information and

    guidelines

    6. Social mobilization

    7. Coordination with partners

    8. Technical support to MoHP for laboratory diagnosis of VPDs

    1.2 Schedules for Internship

    Timeline columns: April (Week 3: 16-22; Week 4: 23-30), May (Week 1: 1-8; Week 2: 9-16; Weeks 3 and 4), June (Weeks 1 to 4)

    S.N.  Learning objectives

    1.    Learn about designing forms and modifying the database; assist in day-to-day activities; develop a conversion software

    2.    Understand data verification by checking for inconsistencies

    3.    Generate analysis tools such as graphs and maps

    4.    Learn about importing and exporting data between various file formats and the database

    5.    Understand querying on single and multiple relations

    6.    Learn about backup, restore and recovery

    1.3 Tasks performed during internship

    1.3.1 Data entry:

    Since the advent of computers, and since the beginning of typing, the need to

    collect and neatly present documents has required data entry.

    Data entry is the act of transcribing some form of data into another form, usually a

    computer program. Forms of data that people might transcribe include handwritten

    documents, information off spreadsheets from another computer program, sequences

    of numbers, letters and symbols that build a program, or simple data like names and

    addresses.

    Disease surveillance is the routine ongoing collection, analysis and dissemination of

    health data that includes the detection and notification of health events, investigation

    and confirmation of cases or outbreaks, creation of reports, provision of feedback, and

    feed-forward to the higher levels for public health interventions. To direct these

    interventions the surveillance team must provide detailed epidemiological

    information.

    1.3.2 Data Verification and Validation

    Data verification is a process which ensures the completeness, correctness, and

    compliance of a data set against the applicable needs or specifications. It includes

    double-checking the collected data against the information actually gathered so that

    human errors can be corrected.

    The purpose of data verification is to ensure that the stored data can be easily located

    whenever it is searched for, irrespective of technical specifications, location,

    and source. In effect, this helps to accelerate organizational processes.

    Data validation is the process of ensuring that a program operates on clean, correct

    and useful data. It uses routines, often called "validation rules" or "check routines",

    that check for correctness, meaningfulness, and security of data that are input to the

    system. The rules may be implemented through the automated facilities of a data

    dictionary, or by the inclusion of explicit application program validation logic.

    The simplest data validation verifies that the characters provided come from a valid

    set. For example, telephone numbers should include the digits and possibly the

    characters (plus, minus, and brackets). A more sophisticated data validation routine

    would check to see the user had entered a valid country code, i.e., that the number of

    digits entered matched the convention for the country or area specified.
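The telephone number example can be sketched in C# roughly as follows. This is only an illustration: the class and method names are hypothetical, allowing spaces is an assumption, and the expected digit count per country would have to come from a lookup that is not shown here.

```csharp
using System;
using System.Linq;

public static class PhoneNumberCheck
{
    // Characters allowed besides the digits 0-9. Allowing a space is an assumption.
    private const string AllowedSymbols = "+-() ";

    // Simplest validation: every character must come from the valid set.
    public static bool HasOnlyAllowedCharacters(string input)
    {
        if (string.IsNullOrEmpty(input)) return false;
        return input.All(c => (c >= '0' && c <= '9') || AllowedSymbols.IndexOf(c) >= 0);
    }

    // More sophisticated validation: the number of digits must match the
    // convention for the country. expectedDigits is supplied by the caller.
    public static bool HasExpectedDigitCount(string input, int expectedDigits)
    {
        if (input == null) return false;
        return input.Count(c => c >= '0' && c <= '9') == expectedDigits;
    }
}
```

A caller would typically run the character check first and only then compare the digit count, so that the second check never has to deal with stray letters.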

    Validation methods

    Allowed character checks

    Checks that only expected characters are present in a field. For example a

    numeric field may only allow the digits 0-9, the decimal point and perhaps a

    minus sign or commas.

    Consistency checks

    Checks that the data in related fields correspond, e.g., if District Code =

    "KTM", then District Name = "Kathmandu".

    Data type checks

    Checks the data type of the input and gives an error message if the input data

    does not match the chosen data type, e.g., in an input box accepting

    numeric data, if the letter 'O' was typed instead of the number zero, an error

    message would appear.

    Format or picture check

    Checks that the data is in a specified format (template), e.g., dates have to be in

    the format DD/MM/YYYY. Regular expressions should be considered for this type of validation.

    Limit check

    Unlike range checks, data is checked for one limit only, upper OR lower, e.g.,

    data should not be greater than 2 (i.e., ≤ 2).

    Presence check

    Checks that important data are actually present and have not been missed out,

    e.g., Patient name must always be specified.

    Range check

    Checks that the data lie within a specified range of values, e.g., the month of a

    person's date of birth should lie between 1 and 12.

    Spelling and grammar check

    Looks for spelling and grammatical errors.

    Uniqueness check

    Checks that each value is unique. This can be applied to several fields but

    mostly applied to primary keys.

    Table Look up Check

    A table look up check takes the entered data item and compares it to a valid

    list of entries that are stored in a database table, e.g., it is used to validate the

    names of villages or municipalities.
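Several of the validation methods listed above can be sketched as small C# helper functions. This is a minimal sketch and not the code used at IPD: the class name FieldChecks and all method names are hypothetical, and the only lookup entry taken from the text is the KTM/Kathmandu pair.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class FieldChecks
{
    // Illustrative lookup data; only the KTM/Kathmandu pair comes from the text.
    private static readonly Dictionary<string, string> DistrictNames =
        new Dictionary<string, string> { { "KTM", "Kathmandu" } };

    // Allowed character check: optional minus sign, digits, optional decimal part.
    public static bool IsNumeric(string value)
    {
        return value != null && Regex.IsMatch(value, @"^-?[0-9]+(\.[0-9]+)?\z");
    }

    // Consistency check: the district name must agree with the district code.
    public static bool IsConsistent(string districtCode, string districtName)
    {
        string expected;
        return districtCode != null
            && DistrictNames.TryGetValue(districtCode, out expected)
            && expected == districtName;
    }

    // Format check: the text must be a real calendar date written as DD/MM/YYYY.
    public static bool IsDate(string value)
    {
        DateTime parsed;
        return DateTime.TryParseExact(value, "dd/MM/yyyy",
            CultureInfo.InvariantCulture, DateTimeStyles.None, out parsed);
    }

    // Presence check: the field must not be empty or blank.
    public static bool IsPresent(string value)
    {
        return !string.IsNullOrWhiteSpace(value);
    }

    // Range check: both limits are tested, e.g. a month must lie between 1 and 12.
    public static bool InRange(int value, int min, int max)
    {
        return value >= min && value <= max;
    }

    // Limit check: only the upper limit is tested.
    public static bool NotAbove(int value, int upperLimit)
    {
        return value <= upperLimit;
    }

    // Table look up check: the value must appear in a list of valid entries.
    public static bool InLookup(string value, ICollection<string> validEntries)
    {
        return value != null && validEntries.Contains(value);
    }

    // Uniqueness check: no value may occur twice (typical for primary keys).
    public static bool AllUnique(IEnumerable<string> values)
    {
        var seen = new HashSet<string>();
        foreach (string v in values)
        {
            if (!seen.Add(v)) return false;
        }
        return true;
    }
}
```

Keeping each check as a separate function makes it possible to apply only the checks that are relevant to a given field, for example the presence check for the patient name and the range check for the month of birth.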

    1.3.3 Data Cleaning

    Data cleansing, data cleaning, or data scrubbing is the process of detecting and

    correcting (or removing) corrupt or inaccurate records from a record set, table, or

    database. Used mainly in databases, the term refers to identifying incomplete,

    incorrect, inaccurate, irrelevant, etc. parts of the data and then replacing, modifying,

    or deleting this dirty data.

    After cleansing, a data set will be consistent with other similar data sets in the system.

    The inconsistencies detected or removed may have been originally caused by user

    entry errors, by corruption in transmission or storage, or by different data dictionary

    definitions of similar entities in different stores.

    Data cleansing differs from data validation in that validation almost invariably means

    data is rejected from the system at entry and is performed at entry time, rather than on

    batches of data.

    The actual process of data cleansing may involve removing typographical errors or

    validating and correcting values against a known list of entities. The validation may

    be strict (such as rejecting any address that does not have a valid postal code) or fuzzy

    (such as correcting records that partially match existing, known records).
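The difference between strict and fuzzy validation against a known list can be illustrated as below. This is a sketch under assumptions: the names NameCleaner, StrictMatch and FuzzyMatch are hypothetical, edit distance is only one possible way to define a "partial match", and the distance threshold is chosen by the caller.

```csharp
using System;
using System.Collections.Generic;

public static class NameCleaner
{
    // Classic dynamic-programming edit distance: each insertion, deletion
    // or substitution of a character costs 1.
    public static int EditDistance(string a, string b)
    {
        int[,] d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i;
        for (int j = 0; j <= b.Length; j++) d[0, j] = j;
        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                    d[i - 1, j - 1] + cost);
            }
        }
        return d[a.Length, b.Length];
    }

    // Strict validation: only an exact match is accepted; otherwise null.
    public static string StrictMatch(string value, IEnumerable<string> known)
    {
        foreach (string k in known)
        {
            if (k == value) return k;
        }
        return null;
    }

    // Fuzzy validation: returns the closest known entry (ignoring case) if it
    // is at most maxDistance edits away; otherwise null.
    public static string FuzzyMatch(string value, IEnumerable<string> known, int maxDistance)
    {
        string best = null;
        int bestDistance = maxDistance + 1;
        foreach (string k in known)
        {
            int distance = EditDistance(value.ToLowerInvariant(), k.ToLowerInvariant());
            if (distance < bestDistance)
            {
                best = k;
                bestDistance = distance;
            }
        }
        return best;
    }
}
```

With the known list { "Kathmandu", "Lalitpur" }, the misspelling "Kathmandoo" is rejected by StrictMatch but corrected to "Kathmandu" by FuzzyMatch with a threshold of 2. A fuzzy correction should still be reviewed by a person, since a close match is not necessarily the intended name.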

    All data is dirty or inconsistent. That is one of the rules of computer-assisted

    reporting. It follows that all data must also be cleaned. Data cleaning can

    be performed with updates, find and replace, and lookup tables. However, we should always work

    on a copy of the table to preserve the original data in its raw form.

    1.3.4 Data Backup

    Backup or the process of backing up is making copies of data which may be used to

    restore the original after a data loss event.

    Backups have two distinct purposes. The primary purpose is to recover data after its

    loss, be it by data deletion or corruption. Data loss can be a common experience of

    computer users. The secondary purpose of backups is to recover data from an earlier

    time, according to a user-defined data retention policy, typically configured within a

    backup application for how long copies of data are required. Though backups

    popularly represent a simple form of disaster recovery, and should be part of a

    disaster recovery plan, by themselves, backups should not alone be considered

    disaster recovery.
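Both purposes of a backup, making a restorable copy and keeping copies only as long as a retention policy requires, can be sketched for a single database file as follows. This is a hedged illustration: the class DatabaseBackup, its method names and the file naming scheme are hypothetical, and the sketch assumes the database file is not open in another program while it is copied.

```csharp
using System;
using System.Globalization;
using System.IO;

public static class DatabaseBackup
{
    // Copies the database file into backupFolder, adding a timestamp to the
    // file name, and returns the path of the copy.
    public static string CreateBackup(string databasePath, string backupFolder, DateTime now)
    {
        Directory.CreateDirectory(backupFolder);
        string name = Path.GetFileNameWithoutExtension(databasePath)
            + "_" + now.ToString("yyyyMMdd_HHmmss", CultureInfo.InvariantCulture)
            + Path.GetExtension(databasePath);
        string target = Path.Combine(backupFolder, name);
        File.Copy(databasePath, target, false);
        // A copied file keeps the source's modification time, so stamp the
        // copy with the backup time; the retention check below relies on it.
        File.SetLastWriteTime(target, now);
        return target;
    }

    // Retention policy: deletes backups older than retentionDays and
    // returns how many files were removed.
    public static int DeleteExpired(string backupFolder, int retentionDays, DateTime now)
    {
        int deleted = 0;
        DateTime cutoff = now.AddDays(-retentionDays);
        foreach (string file in Directory.GetFiles(backupFolder))
        {
            if (File.GetLastWriteTime(file) < cutoff)
            {
                File.Delete(file);
                deleted++;
            }
        }
        return deleted;
    }
}
```

Restoring is then a matter of copying the chosen backup file back over the damaged database. Keeping the backup folder on a different disk or machine is what moves this closer to disaster recovery.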

    1.3.5 Data Analysis

    Analysis of data is a process of inspecting, cleaning, transforming, and modeling data

    with the goal of highlighting useful information, suggesting conclusions, and

    supporting decision making. Data analysis has multiple facets and approaches,

    encompassing diverse techniques under a variety of names, in different business,

    science, and social science domains.

    Type of data

    Data can be of several types:

    Quantitative data: the data is a number.

        Often this is a continuous decimal number to a specified number of

        significant digits.

        Sometimes it is a whole counting number.

    Categorical data: the data is one of several categories.

    Qualitative data: the data is a pass/fail or the presence or lack of a characteristic.

    Data analysis and report generation can be performed by pivoting and charting.

    1.3.6 Performing Queries

    Queries can be used to quickly analyze and sort information that is in an Access

    database. A query lets us put a question to the database by specifying

    criteria.

    Queries allow us to specify:

    The table fields that appear in a query.

    The order of the fields in a query.

    Filter and sort criteria for each field in a query.

    Queries have two views:

    Design view

    Datasheet view

    In the Design view, we specify which fields we want to see, which tables they

    come from, and the criteria that records have to meet in order to appear in the

    resulting datasheet. Criteria are tests that records have to pass. In the

    Datasheet view, we view the records that were found to meet our criteria.

    When we run a query, Access pulls data out of tables and puts the data in a datasheet

    for us to see. The original table and the datasheet stay connected, so that a change made

    to the data on one side is reflected on the other.

    A select query can be used to select certain data from a table or tables. It basically

    filters and sorts the data and can perform simple calculations, such as summing and

    averaging.
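The idea of a select query (filter, sort, simple calculation) can also be expressed outside Access. The sketch below uses LINQ over in-memory records purely as an analogy; it does not query an Access database, and the CaseRecord class with its three properties is a hypothetical, simplified stand-in for a row of the surveillance table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class CaseRecord
{
    public string CaseId { get; set; }
    public string District { get; set; }
    public int AgeYears { get; set; }
}

public static class SelectQuerySketch
{
    // Filter and sort: the cases of one district, youngest first
    // (comparable to WHERE ... ORDER BY in a select query).
    public static List<CaseRecord> CasesInDistrict(IEnumerable<CaseRecord> cases, string district)
    {
        return cases.Where(c => c.District == district)
                    .OrderBy(c => c.AgeYears)
                    .ToList();
    }

    // Simple calculation: the average age per district
    // (comparable to AVG with GROUP BY).
    public static Dictionary<string, double> AverageAgeByDistrict(IEnumerable<CaseRecord> cases)
    {
        return cases.GroupBy(c => c.District)
                    .ToDictionary(g => g.Key, g => g.Average(c => c.AgeYears));
    }
}
```

In Access itself the same questions would be asked in the query Design view or with an SQL SELECT statement; the sketch only shows which operations such a query combines.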

    1.4 Statement of Problem and Objectives

    WHO-IPD receives the surveillance data from 11 field offices and 15 medical officers

    from across the country. The data is received in Microsoft Excel format and is

    manually entered into the Access database.

    The problems with this approach are:

    1. Time consuming

    2. Prone to human error during data insertion

    3. Resource overhead

    Hence, it was determined that an automated system to update the database

    with the periodic surveillance data would benefit the organization. The task

    of developing prototype software for this objective was given to me during the period

    of my internship.

    1.5 Literature review and methodology

    There are a number of tools available for converting spreadsheet files

    into an Access database. Microsoft Access itself also has an option to import data

    from Excel files.

    However, WHO-IPD requires a customized converter because some fields

    depend on others, some need to be removed, some need to be added,

    and the data needs to be validated before insertion into the database.

    The conversion of an .xls file to an .mdb file is a two-step process:

    1. Convert the .xls file to .csv (Comma-Separated Values) format.

    We make use of ExcelDataReader v.2.0.1.0, a lightweight and fast library

    written in C# for reading Microsoft Excel files ('97-2007). The rows read from the source

    Excel file are written to a CSV file in the first phase.

    2. Bulk insert the data from CSV file to access database.
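One detail of the intermediate CSV step deserves a sketch: a cell that itself contains a comma (for example an address) would be split into two fields if it were written as-is. A common convention is to wrap such fields in double quotes and to double any quote inside them. The helper below is a minimal sketch of that convention; the class and method names are hypothetical, and it is not the converter listed in the annexure, which writes fields separated by plain commas.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CsvLineWriter
{
    // Quotes a field only when it contains a comma, a double quote or a
    // line break; embedded double quotes are doubled.
    public static string Escape(string field)
    {
        if (field == null) return "";
        bool mustQuote = field.IndexOfAny(new[] { ',', '"', '\n', '\r' }) >= 0;
        string escaped = field.Replace("\"", "\"\"");
        return mustQuote ? "\"" + escaped + "\"" : escaped;
    }

    // Joins the escaped fields of one row into a single CSV line.
    public static string ToLine(IEnumerable<string> fields)
    {
        return string.Join(",", fields.Select(Escape));
    }
}
```

If fields are quoted in this way when the file is written, the code that reads the CSV file in the second step has to understand the same quoting rules instead of splitting on every comma.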

    Chapter 2

    System Analysis

    System development can generally be thought of as having two major components:

    systems analysis and systems design. In system analysis, more emphasis is given to

    understanding the details of an existing system or a proposed one and then deciding

    whether the proposed system is desirable or not and whether the existing system

    needs improvements. Thus, system analysis is the process of investigating a system,

    identifying problems, and using the information to recommend improvements to the

    system. During system analysis, two types of requirement analysis are performed:

    2.1 Functional Requirement

    The system should be able to convert flat excel files to database files. Functional

    requirements explain what has to be done by identifying the necessary task, action or

    activity that must be accomplished. Functional requirements analysis will be used as

    the top level functions for functional analysis.

    2.2 Non Functional Requirement

    Non-functional requirements are requirements that specify criteria that can be

    used to judge the operation of a system, rather than specific behavior. Non-functional

    requirements are the quality requirements that state how well the

    software does its work. The non-functional requirements are specified as:

    Performance:

    It is the measure of the response time. It is concerned with short response

    time for a given piece of work, high throughput (rate of processing work), low

    utilization of computing resource(s), and high availability of the computing system or

    application.

    Reliability

    It means the extent to which the program performs with the required precision. The system

    developed should be extremely reliable and error free. Reliability is often measured as

    probability of failure, frequency of failures, or in terms of availability.

    Usability

    The application should be user friendly and should require the least effort to operate.

    Flexibility

    It is the effort required to modify an operational program. The whole application should be

    made using independent modules so that changes made in one module do not

    affect the others and new modules can be added easily to increase functionality.

    Chapter 3

    3. System Design

    3.1 DFD (Data Flow Diagram)

    DFD is a picture of the movement of data between external entities and the processes

    and data stores within a system.

    Figure 1: DFD of the system to convert Excel to Access format.

    External entity: USER

    Processes: 1 Select Excel File, 2 Select Sheet, 3 Specify Output filename and folder, 4 Convert File, 5 Select CSV file, 6 Insert into database

    Data flows: Excel file, filename, selected file, sheet, file and folder name, data records, CSV file, Access database filename

    3.2 Use Case Diagram

    A use case is a list of steps, typically defining interactions between a role/actor and a

    system, to achieve a goal. The actor can be a human or an external system.

    3.2.1 Excel to CSV conversion module

    Actor: USER

    Use cases:

    Select Excel file

    Select the sheet

    Specify output folder and file name

    Convert Excel file to CSV

    3.2.2 Module to insert into database

    Actor: USER

    Use cases:

    Select CSV file

    Specify database

    Insert into database

    3.3 External Design:

    The goal of external design is to create a description of all elements of the application which

    interact with users or external systems.

    3.3.1 User Interface

    User Interface Design is concerned with how users add information to the system and with

    how the system presents information back to them.

    3.4 Internal Design

    3.4.1 Database Design

    Database design is the process of producing a detailed data model of a database. This logical

    data model contains all the needed logical and physical design choices and physical storage

    parameters needed to generate a design in a Data Definition Language, which can then be

    used to create a database. A fully attributed data model contains detailed attributes for each

    entity.

    Field Name Data Type

    CaseID TEXT

    OutbreakID TEXT

    Source TEXT

    SourceName TEXT

    PatientName TEXT

    COUNTRY TEXT

    Region TEXT

    DISTRICT TEXT

    VDC/Muni TEXT

    Ward NUMBER

    DNOT DATE/TIME

    DOI DATE/TIME

    DOB DATE/TIME

    SEX TEXT

    Ageyr NUMBER

    Agem NUMBER

    AGRP TEXT

    Vaccinated TEXT

    Hospitalized TEXT

    OUTCOME TEXT

    DONSET DATE/TIME

    RASH TEXT

    FEVER TEXT

    COUGH TEXT

    CORYZA TEXT

    CONJUNCTIVITIS TEXT

    VitaminA TEXT

    AnySpecimen TEXT

    DateBlood DATE/TIME

    LabID TEXT

    Serum TEXT

    Rubella TEXT

    DateUrine DATE/TIME

    UrineResult TEXT

    DateThroatSwab DATE/TIME

    ThroatSwabResult TEXT

    CLASS TEXT

    Fup TEXT

    DFUP TEXT

    Travel TEXT

    Chapter 4

    4. Implementation

    After the system analysis and system design phases comes the implementation phase. It

    includes:

    1. Coding

    2. Testing

    3. Installation

    4. Documentation

    5. Maintenance

    4.1 Coding

    During coding, the physical design is converted into a program written in a programming

    language. Coding is the process of designing, writing, testing, debugging, and maintaining

    the source code of computer programs. This source code is written in one or more

    programming languages. The coding and testing processes are carried out in parallel during

    the project implementation phase. The software was developed using the .NET Framework, with C#

    for coding and OLE DB for database access.

    4.2 Testing

    System testing is the stage of implementation that is aimed at ensuring that the

    system works accurately and efficiently before live operation commences. Testing is

    vital to the success of the system. System testing makes the logical assumption that if all

    the parts of the system are correct, then the goal will be successfully achieved. A

    series of tests is performed on the system before it is ready for user

    acceptance testing.

    The following are the types of Testing:

    1. Unit Testing

    2. Integration Testing

    3. Validation Testing

    Unit Testing:

    Procedure-level testing is done first. Improper inputs are given, and the errors

    that occur are noted and eliminated.

    Initially, the CSV conversion module is tested to verify that the Excel files are

    being correctly converted to the CSV format, and the insertion module is tested to

    verify that the records are correctly inserted into the Access database.

    Integration Testing

    Testing is done for each module. After testing all the modules, the modules are

    integrated and the final system is tested with test data specially designed

    to show that the system will operate successfully under all conditions. Thus,

    system testing is a confirmation that all is correct and an opportunity to show the user

    that the system works.

    Validation Testing

    The final step involves validation testing, which determines whether the

    software functions as the user expected. The end user, rather than the system developer,

    conducts this test. Most software developers use a process called Alpha and Beta

    testing to uncover errors that only the end user seems able to find.

    The completion of the entire project is based on the full satisfaction of the end

    users. In the project, validation testing is done in various forms. In the question entry

    form, only the correct answer will be accepted in the answer box. Answers other

    than the four given choices will not be accepted.

    4.3 Installation

    Installation is the act of making the program ready for execution.

    4.4 Documentation

    Software documentation or source code documentation is written text that accompanies

    computer software. It either explains how it operates or how to use it, and may mean

    different things to people in different roles.

    Documentation is an important part of software engineering. Types of documentation

    include:

    1. Requirements - Statements that identify attributes, capabilities,

    characteristics, or qualities of a system. This is the foundation for what shall

    be or has been implemented.

    2. Architecture/Design - Overview of software. Includes relations to an

    environment and construction principles to be used in design of software

    components.

    3. Technical - Documentation of code, algorithms, interfaces, and APIs.

    4. End User - Manuals for the end-user, system administrators and support

    staff.

    5. Marketing - How to market the product and analysis of the market demand.

    4.4.1 End User Documentation:

    1. Step 1: Select the Excel filename by clicking on the Browse button.

    2. Step 2: Select the sheet from the drop-down list that displays the names of the sheets in

    the selected Excel file.

    3. Step 3: Specify the output filename and output folder and then click the Convert

    button. A "File Converted" message box is displayed. The Excel file is now

    converted to a CSV file.

    4. Step 4: Select the CSV file to be inserted and specify the database name, then click the

    INSERT button. A message box confirming the successful insertion is displayed.

    Chapter 5

    Limitations and Future Enhancements

    During testing with other data, the software was successful in converting other Excel files

    to Access files. In the case of the VPDIFA database, however, the

    Excel file must be pre-processed before use to match the field types defined in VPDIFA,

    which is based on MS Access.

    Preprocessing tasks include:

    The header information in the Excel file should always be removed. Once the header information is removed, however, the outbreak code and

    outbreak date reported (if any) are no longer accessible. Also, as data

    related to an outbreak spans multiple cells, it cannot be extracted and

    has to be dealt with manually.

    Swap the positions of the columns in the Excel file to match their positions in the database

    file.

    Separate the VDC/Municipality name and the ward number.

    Identify the age group.
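One of the preprocessing tasks, separating the VDC/Municipality name from the ward number, could in principle be automated. The sketch below assumes that a cell holds the name followed by a trailing ward number, separated by spaces, commas or hyphens (for example "Kirtipur-5"); the report does not state the actual cell format, so this shape, like the class and method names, is an assumption.

```csharp
using System;
using System.Text.RegularExpressions;

public static class VdcWardSplitter
{
    // Assumed shape: name, optional separators (space, comma, hyphen),
    // then the ward number at the end of the text.
    private static readonly Regex Pattern =
        new Regex(@"^\s*(.+?)[\s,\-]*([0-9]+)\s*$");

    // Returns true and fills name and ward when the text has the assumed
    // shape; returns false otherwise so the row can be handled manually.
    public static bool TrySplit(string raw, out string name, out int ward)
    {
        name = null;
        ward = 0;
        if (raw == null) return false;
        Match m = Pattern.Match(raw);
        if (!m.Success) return false;
        if (!int.TryParse(m.Groups[2].Value, out ward)) return false;
        name = m.Groups[1].Value;
        return true;
    }
}
```

Rows for which TrySplit returns false would still need manual attention, which keeps the automated step from silently guessing.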

    Limitations of this Software:

    It assumes that the Source column always contains only the numbers 1, 2, 3, 4, 5,

    representing SMO, AS, RU, OS, RRT respectively.

    It cannot check the spellings of VDCs and Municipalities. However, it is enforced

    that the name of a VDC/Municipality should be the same as defined

    previously. Hence, the program might throw an error if the spelling of a

    VDC/Municipality does not match the defined name.
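Both limitations could be softened by checking values before insertion instead of failing during it. The following is a sketch of one possible approach, not part of the delivered software: the class and method names are hypothetical, the source codes are the five listed above, and the list of defined VDC/Municipality names is assumed to be supplied by the caller.

```csharp
using System;
using System.Collections.Generic;

public static class SourceAndVdcChecks
{
    // Source codes and their abbreviations as listed in the report.
    private static readonly Dictionary<string, string> SourceNames =
        new Dictionary<string, string>
        {
            { "1", "SMO" }, { "2", "AS" }, { "3", "RU" }, { "4", "OS" }, { "5", "RRT" }
        };

    // Produces e.g. "1 - SMO". Returns false for any other value so that
    // an unexpected code is reported instead of being passed through.
    public static bool TryFormatSource(string code, out string formatted)
    {
        string name;
        if (code != null && SourceNames.TryGetValue(code.Trim(), out name))
        {
            formatted = code.Trim() + " - " + name;
            return true;
        }
        formatted = null;
        return false;
    }

    // Table look up against the previously defined names, ignoring case
    // and surrounding spaces.
    public static bool IsKnownVdc(string name, IEnumerable<string> definedNames)
    {
        if (name == null) return false;
        var known = new HashSet<string>(definedNames, StringComparer.OrdinalIgnoreCase);
        return known.Contains(name.Trim());
    }
}
```

Rows that fail either check can be collected and shown to the user in one list, rather than surfacing as a database error in the middle of an insertion run.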

    Conclusion

    The work I did during my internship gave me a lot of knowledge in the field of database management, such as data validation and verification, cleaning, analysis, and performing queries, along with the design and implementation of forms. The file format conversion software I developed during this period gave me an insight into various aspects of programming and the MS Access database. In review, this internship has been an excellent and rewarding experience.

    References

    http://exceldatareader.codeplex.com/

    www.whoipd.org

    http://en.wikipedia.org

    http://stackoverflow.com

    Annexure

    1. Source Code

    // Code to read Excel files.
    // Needs: using System.Collections.Generic; using System.IO; and the ExcelDataReader library.
    private void getExcelData(string file)
    {
        if (file.EndsWith(".xlsx"))
        {
            // Excel 2007 (OpenXML) workbook
            FileStream stream = File.Open(file, FileMode.Open, FileAccess.Read);
            IExcelDataReader excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
            result = excelReader.AsDataSet();
            excelReader.Close();
        }
        else if (file.EndsWith(".xls"))
        {
            // Excel 97-2003 (binary) workbook
            FileStream stream = File.Open(file, FileMode.Open, FileAccess.Read);
            IExcelDataReader excelReader = ExcelReaderFactory.CreateBinaryReader(stream);
            result = excelReader.AsDataSet();
            excelReader.Close();
        }
        else
        {
            // No workbook was read, so there are no sheet names to list.
            return;
        }

        // Fill the drop-down list with the sheet names of the workbook.
        List<string> items = new List<string>();
        for (int i = 0; i < result.Tables.Count; i++)
            items.Add(result.Tables[i].TableName);
        comboBox1.DataSource = items;
    }

    // Code to convert into CSV format.
    // Needs: using System.Data; using System.IO; using System.Text;
    private void convertToCsv(int ind)
    {
        DataTable table = result.Tables[ind];
        StringBuilder a = new StringBuilder();
        for (int row_no = 0; row_no < table.Rows.Count; row_no++)
        {
            for (int i = 0; i < table.Columns.Count; i++)
            {
                // Columns 6 and 8 are skipped (not written to the CSV file).
                if (i == 6 || i == 8)
                    continue;

                string value = table.Rows[row_no][i].ToString();
                a.Append(value);
                if (i == 2)
                {
                    // Column 2 holds the Source code; append its abbreviation.
                    switch (value)
                    {
                        case "1": a.Append(" - SMO"); break;
                        case "2": a.Append(" - AS"); break;
                        case "3": a.Append(" - RU"); break;
                        case "4": a.Append(" - OS"); break;
                        case "5": a.Append(" - RRT"); break;
                    }
                }
                a.Append(",");
            }
            a.Append("\n");
        }

        string output = textBox1.Text + "\\" + textBox2.Text + ".csv";
        // The using block closes the file even if writing fails.
        using (StreamWriter csv = new StreamWriter(output, false))
        {
            csv.Write(a.ToString());
        }
        MessageBox.Show(" File Converted");

        // Reset the form for the next conversion.
        txt_browse.Text = "";
        textBox2.Text = "";
        textBox1.Text = "";
        comboBox1.DataSource = null;
    }

    // Code to modify the attribute values and insert them into the database.
    // Needs: using System.Data.OleDb; using System.IO;
    private void button1_Click(object sender, EventArgs e)
    {
        string tex;
        using (StreamReader sr = new StreamReader(Txt_SelectToInsert.Text))
        {
            tex = sr.ReadToEnd();
        }

        // Every CSV row ends with ",\n": split on '\n' and ignore empty rows.
        string[] db = tex.Split(new char[] { '\n' }, StringSplitOptions.RemoveEmptyEntries);

        string constr = @"Provider=Microsoft.Jet.OLEDB.4.0; Data Source = " + txt_db.Text;
        OleDbConnection conn = new OleDbConnection(constr);
        try
        {
            // One connection is opened and reused for all rows.
            conn.Open();
            foreach (string row in db)
            {
                try
                {
                    // Drop the last character (the trailing comma).
                    string line = row.Substring(0, row.Length - 1);
                    // Quote every value; single quotes inside a value are doubled
                    // so that they cannot break the SQL statement. The first
                    // value is prefixed with "MSLNEP".
                    string enter = "'MSLNEP" + line.Replace("'", "''").Replace(",", "','") + "'";

                    OleDbCommand cmd = new OleDbCommand();
                    cmd.Connection = conn;
                    cmd.CommandText = "insert into final_tbl values(" + enter + ")";
                    cmd.ExecuteNonQuery();
                }
                catch (Exception ex)
                {
                    // Report the failed row and continue with the next one.
                    MessageBox.Show("Error try again" + "\n" + ex);
                }
            }
        }
        catch (Exception ex)
        {
            MessageBox.Show("Error try again" + "\n" + ex);
        }
        finally
        {
            conn.Close();
        }

        Txt_SelectToInsert.Text = " ";
        txt_db.Text = " ";
    }

    private string District_check(string district)
    {
        switch (district)
        {
            //code for returning district name based on district codes
            default:
                return " ";
        }
    }