
  • UNIVERSIDADE DE LISBOA, FACULDADE DE CIÊNCIAS

    DEPARTAMENTO DE INFORMÁTICA

    THREAT INTELLIGENCE: USING OSINT AND SECURITY METRICS TO ENHANCE SIEM CAPABILITIES

    João Paulo Martins José Teixeira Alves

    MSc in Information Security (Mestrado em Segurança Informática)

    Dissertation supervised by: Prof. Doutora Ana Luísa do Carmo Correia Respício and Eng. Pedro da Silva Dias Rodrigues

    2017

  • Acknowledgements

    Before expressing my gratitude to those who helped me at this stage of my academic path, I want to stress that if anyone is not mentioned in this document, it is merely an oversight on my part, and I want you to know that I am entirely grateful. Those who know me know that I am not the kind of person who shows gratitude through words, but rather through actions.

    My thanks to the DiSIEM project, funded by the European Commission under the H2020 programme, G.A. no. 700692, for funding this work.

    The first people I want to thank are my supervisors, Prof. Doutora Ana Respicio and Eng. Pedro Rodrigues, not only for accepting me as their student and giving me this experience, but also for all their support, teaching and complete availability whenever it was needed. Thank you, for I learned a great deal from both of you. I thank Ivo Rosa and Gonçalo Martins of EDP. Ivo, you taught me a lot and helped whenever you could. I am grateful for our weekly debates about new ideas and the possible directions this work could take. I have said it before and I say it again: you are a source of ideas, and all it takes is someone attentive enough to listen and put them into practice. I tried to be that person. You were not my supervisor, yet there were moments when you guided me. Thank you very much! Gonçalo, you were the person who made me feel like a member of the team right away. You were always available and, even with an enormous workload, you always helped me. You always had patience with me, I do not know how, even to answer the thousand and one questions I had every day. You taught me not only what was essential for my work but far more than was necessary, and I know that what I learned will be useful in my future. You always paid attention to my work and looked out for me, making me take breaks to rest and have lunch so that I could recover my energy before returning to work. I am grateful for your constant concern and help.

    I thank the SOC team (including the intern) for helping me monitor the incidents for the thesis and the rule that was created. You were always concerned about my work and whether I needed anything. You made me feel like a member of the team. I know you learned from me, but I learned a lot from you, and you brought joy to the work, even in moments of distress. I do not want to forget all the other members of EDP's DSI team (including the colleague from Brazil and desk partner): I thank you all for the help, the well-being and the fun moments I had while I was with you. I know I can count on you in the future, and for that trust I am grateful.

    I thank the members of the project for the help they gave me in the meetings, as we debated new ideas to improve the work.

    To all the members of room 6.3.34, I do not want to leave anyone out, you know who you are: I thank you all for the company and the good times, in which there was time for laughter, “civilized” arguments and mutual help with work. I know I made new friendships and strengthened the old ones. I also know that these friendships will not stay only in the room.

    Finally, I thank my parents, grandparents, the rest of my family and my friends for always grumbling at me and trying to arrange dinners to unwind after a day of work. You are a nuisance, but good people, always concerned about my well-being and the state of my work. Thank you so much.

    This work is supported by the European Commission through the H2020 programme under grant agreement 700692 (DiSIEM).


  • Dedicated to family and friends

  • Summary (Resumo)

    In recent years, given the growth in the number and complexity of cyberattacks against a wide range of organizations, investment in information security platforms within organizations' infrastructures has grown considerably. The teams responsible for ensuring cybersecurity need to monitor a vast number of devices, users and applications and, consequently, the cybersecurity events related to those elements. The platform most widely used to monitor security events is the Security Information and Event Management (SIEM) system. This system aggregates all the security information coming from various sources, normalizes it, enriches it and sends it to a centralized management console. The efficiency and effectiveness of security incident response teams depend to a large extent on the system's ability to produce detailed and contextualized alarms about possible threats. To improve that ability, it is necessary to combine relevant external indicators with the information collected within the organization's infrastructure.

    Threat Intelligence (TI) is the knowledge gained by combining techniques for collecting information about threats external to the organization with techniques for collecting information about the organization's internal security factors. It is necessary to pay attention to public sources of cybersecurity information and to assess their quality in order to obtain reliable indicators of malicious activity.

    The organization needs to assess its cybersecurity level to identify existing vulnerabilities before they can be exploited by malicious agents. Only by drawing on both internal and external information sources is it possible to have a comprehensive TI approach and to apply the appropriate cybersecurity measures to prevent the cyberattacks to which the organization may be vulnerable.

    For an organization to establish its cybersecurity level correctly, adequate risk management is required. Risk management consists of three stages, all interlinked and continuous: risk analysis, risk assessment and risk control. At the end of the process, the organization will have credible knowledge of its IT risk and a sound basis for decision-making regarding restructuring and investment in information security.


    Security metrics are the most suitable tool for the risk management process. They help to determine the cybersecurity state of the organization, the performance of the Security Operation Center (SOC) team, and the security level of the organization's infrastructures. Government and military entities were the first to use security metrics. Recently, however, researchers from various types of organizations (public, private and public-private) have invested resources in improving and implementing these metrics in their organizations. All this attention is due to the evident result of their implementation: it becomes possible to measure risk, classify it and, finally, take the appropriate countermeasures to reduce the impact of possible cyberattacks, increasing the organization's cybersecurity. It is nevertheless necessary to establish the objectives and purpose of the security metrics. Many cybersecurity teams make the mistake of creating metrics that are complex, out of context and that express results with unrealistic values. The result of this poor management of security metrics is the opposite of what was intended: it provides bad information and, consequently, decreases the organization's cybersecurity. Visualizing the metric results is the last step of metric creation, and its purpose is to present information in an illustrative way, using formats that are easy to read and understand. Visualizations help the team responsible for an organization's cybersecurity to see at a glance information about the cybersecurity level of the systems and the risk of each asset. Visualizations allow the team to assess and answer, quantitatively and qualitatively, the questions posed by executive management, such as: what is the security level, what is the risk value in the organization, what is the financial return on the investments made to improve information security in the organization, or even how to justify keeping, reducing or increasing cybersecurity equipment and teams.

    Alongside the mechanism for discovering internal information, Open Source Intelligence (OSINT) is regarded as the mechanism for capturing external information from online sources. With a set of techniques it is possible to capture the information relevant to knowledge about cyberthreats. There are cybersecurity communities whose goal is to publish lists with information about new cyberattacks, which usually contain information about suspicious hosts or malicious content. These lists, the blacklists, can be public, when anyone can access their information, or private, restricting their use to a particular group or community. Although the lists offer valuable information about current cyberthreats, without any kind of pre-processing they can generate a significant number of false positives, owing to the lack of contextualization and alignment with the organization's reality.

    This work is divided into two topics: security metrics and trustworthy blacklists. For each topic, solutions are described to improve the security state of an organization by integrating the TI process in real time into the SIEM. This integration can take the form of using security metrics to analyse the organization's security state, and of using security sources with information about IP addresses suspected of malicious activity while taking into account, by means of metrics, the SOC team's handling of security incidents. Using the blacklists directly, without any kind of pre-processing, results in a high number of false positives, owing to the lack of contextualization and alignment with the organization's reality.

    The work is part of the DiSIEM project and results from the collaboration of two of the project partners, Faculdade de Ciências da Universidade de Lisboa and EDP - Energias De Portugal, SA. The objectives are aligned with the goals of the DiSIEM project: 1) to provide OSINT information to a SIEM system, improving its detection and prevention of new threats; 2) to identify and develop a set of metrics dedicated to the cybersecurity team for better management and monitoring of security events, so as to raise the organization's security state and consequently reduce the risk of malicious activity in the organization.

    The dissertation presents and discusses a set of metrics with a well-defined structure to be applied in the SIEM system. These metrics cover the management, process and technology sectors, and are suited to the reality of the cybersecurity team. Prototypes are introduced for visualizing the metric results, including historical data, thus enabling a comparative evaluation of efficiency.

    The work proposes an OSINT solution to refine the alarms of the SIEM system, reducing the false positive rate, based on an assessment of the level of trust in public information sources, and in this way to contribute to the efficiency of cybersecurity teams in organizations that use a SIEM system. This solution uses blacklists that identify Internet Protocol (IP) addresses suspected of malicious activity. The information may concern their maliciousness, the number of reports (made by communities or other blacklists), the number of attacks with which the IP address has been associated, the last time it was reported, among others. The blacklists are useful in the SIEM system for monitoring communications between the organization and a suspicious IP. Thus, when there is an alarm for a suspicious communication, the SOC team can act immediately and analyse the events to identify the machine, request a local analysis and eliminate the threat, if one is detected.

    The solution collects information about IP addresses from a set of public lists. The IP addresses and the lists are assessed for their veracity, based on the correlation of the information collected from the lists and on metrics about the outcome of incidents associated with suspicious communications between the organization and IP addresses on the lists. This assessment is carried out continuously, whenever there is a change in the public lists or in the incidents, so that its values are as up to date and precise as possible.

    An application was developed to administer the blacklists used, the IP addresses, the organization's cases and the organization's public addresses. SIEM rules are presented that select the IP addresses collected from the blacklists based on the reputation given by the assessment of their veracity, for monitoring communications between the organization and the suspicious IP addresses.

    The results show that the detection of positive cases increases when the proposed solution is used. This increase is due to the use of internal information from the incidents handled by the SOC team as parameters for assessing the trustworthiness of the blacklists and the IP addresses. Two components stand out as parameters for assessing trustworthiness: the precision component and the persistence component. The precision component takes the organization's results into account and increases the trustworthiness of an IP address or a list if the number of positive results among the incident cases related to the IP is greater than the number of false positive results. Persistence takes into account the precision and the reporting of an IP address by the lists, in order to keep it in our list for three months.
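    The precision and persistence components described above can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the score range of 0 to 1, the `step` increment, the approximation of three months as 90 days, and the function names `update_trust` and `is_retained` are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Assumption: "three months" is approximated as a 90-day window.
RETENTION = timedelta(days=90)


def update_trust(score: float, true_positives: int, false_positives: int,
                 step: float = 0.1) -> float:
    """Raise the score when confirmed incidents outnumber false alarms.

    The score is kept in [0, 1]; both the range and the step are hypothetical.
    """
    if true_positives > false_positives:
        score += step
    return min(1.0, max(0.0, score))


def is_retained(last_reported: date, today: date) -> bool:
    """Keep an address while its latest report is within the retention window."""
    return today - last_reported <= RETENTION
```

    For example, `update_trust(0.5, 3, 1)` raises the score because three confirmed cases outnumber one false positive, while an address last reported more than 90 days ago would no longer be retained.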

    Assessing the blacklist and its content while considering the organization's environment is a solution that has not been presented in any other work; the most similar approaches use metrics or collect information using the OSINT concept, without assessing the content on the basis of the organization's information. Being an innovative work, it is still at an early stage. The results of our study will serve as a basis for improvement and for comparison with the results of later studies aimed at improving the assessment of the trustworthiness of public lists and the maliciousness of their content.

    Keywords: security metrics, SIEM, OSINT, blacklists, internet protocol, cyberthreats, threat intelligence


  • Abstract

    Threat Intelligence (TI) is a cyber defence process that combines the use of internal and external information discovery mechanisms. The Security Information and Event Management (SIEM) system is the tool typically used to aggregate data from multiple sources, normalize and enrich it, and send it to a centralized management console, which is then used by the Security Operation Center (SOC) team. However, it is necessary to use Security Metrics (SM) to summarize, calculate and provide valuable information to the SOC team from the large datasets collected in the SIEM. Although SM provide valuable information, creating or using them erroneously can have the opposite effect and decrease the security level by generating false positives.

    Regarding external information discovery, information from blacklists is commonly used to monitor and/or block external cyberthreats. Blacklists provide intelligence about suspicious Internet Protocol (IP) addresses reported by communities and security organizations. Although blacklists are commonly used to detect suspicious communications, they generate a high rate of false positives.

    We introduce a set of security metrics, well structured and properly defined for use with a SIEM system. We develop a solution with an Open-Source Intelligence (OSINT) mechanism to discover and collect suspicious IP addresses from public blacklists, and a process to assess the reputation of the suspicious IP addresses and blacklists, considering the persistence of each IP address and the organization's incidents involving communications with suspicious IP addresses. The IP addresses are inserted in the SIEM with rules that monitor them, aiming at reducing the number of false positives.

    The preliminary study in a real environment shows that the proposed solution improves the security effectiveness of the SIEM's alerts, owing to the innovative idea of assessing the IP addresses and blacklists using the persistence and precision components and considering the status of the organization's incidents.

    Keywords: security metrics, SIEM, OSINT, blacklist, internet protocol, cyberthreats, threat intelligence


  • Contents

    List of Figures

    List of Tables

    1 Introduction
      1.1 Motivation
      1.2 Goals
      1.3 Contributions
      1.4 Work Plan
      1.5 Structure of the document

    2 Context
      2.1 Energias de Portugal
        2.1.1 EDP’s Security Operations Center
      2.2 Security Information Event Management
      2.3 ArcSight
        2.3.1 Components

    3 Related Work
      3.1 Security Metrics
        3.1.1 Definition and Purpose
        3.1.2 Gathering and Generating Metrics
        3.1.3 Categorization, Classification and Taxonomies
        3.1.4 Visualization
        3.1.5 Metrics used for threat intelligence
      3.2 Trustworthy Blacklists
        3.2.1 Open Source Intelligence
        3.2.2 The efficacy and trustworthinesses of Blacklists
        3.2.3 Blacklists without trustworthiness
      3.3 Summary of the chapter

    4 Security Metrics for SIEM systems
      4.1 Definition
      4.2 Taxonomy and Methodology
      4.3 Proposed SM
        4.3.1 PETVI
        4.3.2 ERVIDENT
        4.3.3 TPerf
        4.3.4 Trustworthiness blacklists’ metrics
      4.4 Visualization
      4.5 SM solutions

    5 Trustworthy Blacklists
      5.1 Architecture
        5.1.1 Software Requirements & Database
        5.1.2 IP collector
        5.1.3 Trustworthiness Assessment
        5.1.4 Trustworthy Assessment Blacklists Interface
      5.2 SIEM
        5.2.1 BADIP list
        5.2.2 SIEM rules
        5.2.3 SIEM Sources

    6 Results
      6.1 Preparation & Practical Case Study
      6.2 Analysis and Results
        6.2.1 Lists
        6.2.2 Analysis of the public Blacklists
        6.2.3 IP Addresses assessment
      6.3 Prospective studies and discussion conclusions

    7 Conclusion & Future Work

    References

    A DiSIEM - SM survey

    B Public Blacklists

    C UML for the framework solution


  • List of Figures

    1.1 Work plan diagram

    2.1 EDP’s business organization (EDP - Organização dos negócios, 2016)
    2.2 ArcSight Architecture
    2.3 ArcSight Console View - retrieved from [11]
    2.4 SIEM rule configuration options
    2.5 ArcSight Dashboards - retrieved from [11]

    3.1 Two approaches to generate security metrics
    3.2 IBM Taxonomy: Classification of Security Metrics by their Input Types - retrieved from [26]
    3.3 Business-level Security metrics (levels 0 and 1 taxonomy) - retrieved from [36]
    3.4 Security metrics for information security management in the organization - retrieved from [36]
    3.5 Examples of the technique treemap - retrieved from [28]
    3.6 Security metrics for information security management in the organization - retrieved from [28]

    4.1 Taxonomy for the SM following the Capabilities of the SOC
    4.2 Visualization Prototypes

    5.1 Workflow of the framework
    5.2 SM displayed in homepage of TABI - 1
    5.3 SM displayed in homepage of TABI - 2
    5.4 The public blacklist’s precision over months (example)
    5.5 SIEM rule configuration options

    6.1 Workflow of the Case study analysis
    6.2 Comparison of the precision between the lists over December 2016 to April 2017
    6.3 BADIP Accuracy in the two scenarios
    6.4 Blacklists initial and final trustworthiness score in each month
    6.5 Trustworthiness Assessment of the blacklists over the five month period

    C.1 Database UML

  • List of Tables

    3.1 Business functions and their purpose - derived from [9]
    3.2 Metrics Categorization - derived from [9]

    5.1 Combinations of possible presence and the IP’s persistence values

    6.1 Comparison between the lists values of the results of the cases
    6.2 IP assessment over the months of the December (2016) and January (2017)
    6.3 IP assessment over the months of the February, March and April of 2017

    B.1 Public Blacklists and their information


  • Chapter 1

    Introduction

    In recent years, due to the increase in the number and complexity of cyberattacks against organizations, there has been an increase in the investment in Information Technology (IT) security solutions in the organizations’ infrastructures. Teams responsible for the organization’s cybersecurity need to monitor a vast number of devices, users and applications and, consequently, the cybersecurity events related to these elements. The typical platform used to monitor those events is the Security Information and Event Management (SIEM) system. This system aggregates all the information about cybersecurity events from various sources, normalizes it, enriches it and sends it to a centralized management console. The effectiveness of the cybersecurity incident response team depends on the capability of the SIEM to produce detailed and contextualized alarms for possible threats, and on the use of Security Metrics (SM) that can evaluate the security level of the organization and the performance of the Security Operation Center (SOC) team. To improve this capability, it is necessary to combine relevant external indicators with the information gathered in the organization’s infrastructure, and to structure SM that are suitable for the SOC capabilities.

    Threat Intelligence (TI) is the process of extracting information about cyberthreats from diverse sources (internal and external). It is necessary to be aware of the cybersecurity information sources on the Internet in order to obtain reliable indicators about cyberthreats (the external source), and to extract knowledge about the organization’s security status in order to identify the vulnerabilities that can be exploited by an attacker (the internal source). Only with the combination of both sources, external and internal, is it possible to have a thorough TI approach and apply the security measures that reduce the risk of cyberattacks, thus enhancing the security status of the organization.

    Security Metrics (SM) are used to assess the security status, the performance of the Security Operation Center (SOC) team, and the security and health of the infrastructures in the organization. In science, the term ’metric’ has been in use for over 200 years. Although SM were already being investigated and implemented by the government in the 1960s [38], only in recent years have they received more attention, with improvements and implementations by researchers from all types of organizations (private, public, military, and more). In addition to providing knowledge about the weaknesses and flaws within the organization (security status) and about the performance and work done by the information security team and cybersecurity appliances, SM also provide relevant indicators about malicious threats. The use of SM significantly enhances the measurement of risk, providing information about the vulnerable assets, the dependencies between them, and the most critical sectors within the organization [23, 33, 39]. C-level managers can use this information to make well-supported decisions on cybersecurity strategies and countermeasures to reduce the impact of cyberthreats. Therefore, SM can enhance the organization’s security status [23, 33, 39].

    While SM serve as the internal source, the sources commonly used to retrieve external information about current cyberthreats are known as cybersecurity feeds, especially blacklists. Blacklists are lists containing information about suspicious hosts or malicious content. This work uses blacklists that identify Internet Protocol (IP) addresses. The lists can be public, i.e. anyone can retrieve information from them, or private, i.e. restricted to a particular group or community. The information can cover the type of maliciousness (botnet, phishing, ransomware, or DoS), the number of reports (by user communities or other blacklists), the number of attacks, the last time reported, and more. Although the typical approach is to block communications with these suspicious IP addresses, this approach only prevents the malicious communication and does not consider the possibility that a machine is already infected. Blacklists are helpful for the SOC team to monitor communications between the organization and a suspicious IP. When there is an alarm, the SOC team can take immediate action and analyse the asset to detect and eliminate the infection.
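    As an illustration of monitoring rather than blocking, the following Python sketch models a blacklist entry with the kinds of fields mentioned above and flags connections that touch a listed address. The `BlacklistEntry` structure, the `suspicious_connections` function and the input format of `(internal_host, remote_ip)` pairs are hypothetical, not taken from the dissertation or from any SIEM product.

```python
from dataclasses import dataclass
from datetime import date
from ipaddress import ip_address


@dataclass(frozen=True)
class BlacklistEntry:
    """Hypothetical record for one suspicious IP address."""
    ip: str
    category: str        # e.g. "botnet", "phishing", "ransomware", "DoS"
    reports: int         # number of reports by communities or other lists
    last_reported: date


def suspicious_connections(connections, entries):
    """Return (internal_host, entry) pairs for traffic that touches a listed IP.

    `connections` is an iterable of (internal_host, remote_ip) string pairs.
    Nothing is blocked here: the result is meant to feed an alert for the SOC.
    """
    listed = {ip_address(entry.ip): entry for entry in entries}
    alerts = []
    for internal_host, remote_ip in connections:
        entry = listed.get(ip_address(remote_ip))
        if entry is not None:
            alerts.append((internal_host, entry))
    return alerts
```

    Parsing addresses with `ipaddress.ip_address` rather than comparing raw strings means that malformed input raises an error early instead of silently failing to match.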

    This work is part of the Diversity enhancements for Security Information and Event Management (DiSIEM) project [12] and was implemented in collaboration with two of the organizations that form the consortium: Faculty of Science of the University of Lisbon and EDP - Energias de Portugal, SA.

    “If you know the enemy and know yourself, you need not fear the result of a hundred battles.” [40]

    1.1 Motivation

    As the information security team gets more assets to manage and secure, the need arises to create Security Metrics that measure the whole security environment. These SM should cover all levels, from the technical/operational view up to the C-level security manager. There are fundamental questions that should be asked and answered when creating and implementing security metrics, such as which SM should be used, what raw data is appropriate to feed the SM, what to extract and display from the SM, and how to integrate them with the collector system currently in use. Although SIEM systems are built with predefined SM, it is crucial for the safety and security of the organization to implement, in the SIEM system or with information provided by it, custom metrics tailored to the environment, context and objectives of the organization.

    The Internet is a vast source of information and can provide knowledge to the organization about cyberthreats. But how to use it, where to use it, what to do with that information, and how to implement it without compromising or completely modifying the workflow and technologies in use (such as the SIEM system) are issues to be resolved. And how trustworthy is the gathered information? In addition to gathering the information, it is necessary to classify its trustworthiness and reliability according to the organization’s environment and security status.

    Due to the potentially valuable knowledge about cyberthreats that the Internet and information sharing between organizations can offer, information security teams already use private lists. Although public lists produce a significant number of false positives, teams are starting to use them, but only those public lists that have a certain level of trust. This level of trust comes from the team’s experience in using the lists and from the lists’ reputation in the cybersecurity communities.

    As discussed in the related work, no work was found that classifies a set of public lists and their content using inside information about the organization’s security status and applies that knowledge to calculate the trustworthiness of a blacklist and of the suspected maliciousness of an IP address. The organization should have, on its side, a method or system capable of receiving information from multiple sources and classifying it according to the correlation between those sources and the environment of the organization, providing an output list with a reputation score that is more accurate, reliable and suitable for the organization’s reality.

    1.2 Goals

    The work presented in this dissertation follows two different but interrelated approaches to enhance the security status of an organization by improving the SIEM capabilities. To achieve that, two objectives are set: 1) to establish a set of adequate Security Metrics to be applied within the SIEM system; 2) to develop a solution to gather information from public lists, classify the trustworthiness of its content considering external and internal information, monitor the organization’s communications, and reduce the number of false positives without decreasing the true positive rate.

    The objectives fall within two goals of the DiSIEM project: 1) feeding OSINT information into the SIEM system, improving the detection and prevention of new threats; 2) defining a set of metrics for every type of SOC, dedicated to the SOC capabilities and to Security Information management, to monitor all the personnel and infrastructure active in the organization’s security information.

    The SM will increase the organization’s awareness of its security status and, consequently, its knowledge about the risk within the organization. The OSINT solution will be capable of gathering information from public blacklists reporting suspicious IP addresses and assessing that information with metrics to provide a reliable and trustworthy output with a reputation score to be used with the SIEM rules, thus enhancing the SIEM capabilities in monitoring the organization’s network communications.

    1.3 Contributions

    This work offers three main contributions: 1) a set of well-structured and categorized security metrics; 2) new visualization prototypes to present the information from SM; 3) a new framework for gathering, assessing and managing public blacklists using external and internal information.

    As a result of the project, EDP now includes OSINT from blacklists and security metrics in its monitoring process to improve the efficiency of the SOC team. They started by creating their own public-private blacklist containing public and private information, including the results of alert investigation to assess the effectiveness of their sources. The incident response procedures were adapted to provide feedback so that each alert could be categorized as a true positive or a false positive. The introduction of OSINT, combined with continuous improvement of the Security Incident Management process, increased the malware detection rate while also reducing the number of false alerts, making operations more effective.

    The framework of the third contribution is divided into modules, each independent of the others, so the organization can choose the modules most suitable for its status and priorities: 1) a program to gather suspicious IP addresses from a set of blacklists; 2) an assessment program to analyse and classify over time the trustworthiness of the gathered IP addresses and blacklists; 3) a set of example rules to be used by a SIEM to reduce false positives; 4) a management interface enabling the end-user to manage and monitor the blacklists, suspicious IP addresses, cases related to suspicious communications and the organization’s public IP addresses from a graphical web interface.

    As a result of the contributions of this work, the article “Threat Intelligence: Usando informação sobre IP maliciosos para melhorar a eficácia de um sistema SIEM” was written, submitted to the Portuguese conference INForum [22] and accepted for publication in the proceedings.

    1.4 Work Plan

    The work began on the 30th of September 2016 and was originally expected to end on the 30th of June 2017. However, due to the addition of objectives and the occurrence of unexpected issues, the work lasted until August 2017. This section describes the initial plan, the additional plan and the accomplished plan.

    It is noteworthy that in mid-December the initial plan changed with the addition of a new theme that would become the second component of this work: trustworthy blacklists. This new module arose from the need of the organization involved (EDP), which at the time was searching for solutions to collect information from various public blacklists and to use metrics to assess the content of these blacklists, in order to reduce the number of false positives for which blacklists have a reputation. The research and study of related work and of the topic of SM had already been carried out; however, this stage was extended to the study of blacklists, their collection, the evaluation of their content and its insertion into the SIEM systems to be monitored. With the combination of the two, i.e. Security Metrics and Trustworthy Blacklists, the work was no longer focused only on SM for SIEM systems, but also on threat intelligence.

    The work was completed in August with the writing of this dissertation, two months beyond its initial plan.

    Figure 1.1 displays the course of this work, with the initial plan (including the additional plan) and the accomplished plan.

    Figure 1.1: Work plan diagram

    1.5 Structure of the document

    The remainder of this document is organized as follows. Chapter 2 introduces the context of this work and is divided into two topics: EDP and the SIEM system. The first topic focuses on EDP’s business and its SOC team; the second gives a detailed description of the SIEM used at EDP, HPE ArcSight, including its architecture and features. Chapter 3 reviews the related work and discusses the current state of the art in the areas of security metrics and trustworthiness of blacklists, including the challenges of the SM concept, how practical SM can be for the SOC, and their implementation with SIEM systems. Chapter 4 presents the work developed on Security Metrics for SIEM systems: the proposed metrics, taxonomies, principles and visualizations. These security metrics will help the information security team to manage its infrastructure and workflow, and the C-level managers to manage the SOC team and the organization’s security resources. Chapter 5, Trustworthy Blacklists, describes the developed framework for gathering public information from a set of public sources, the assessment of the collected information, and a web interface to manage and visualize the results of the framework. In addition, the chapter provides guidelines for creating rules to be used in a SIEM system to monitor and alert on suspicious communications between the organization and IP addresses suspected of maliciousness. Chapter 6 describes how the experiment was prepared, the environment in which it ran, and the analysis of its results. The document ends with a summary, conclusions drawn from the analysis of the results, and future research and developments to improve our work.

    Chapter 2

    Context

    2.1 Energias de Portugal

    EDP - Energias de Portugal, SA is considered one of the major electricity operators in Europe. It is also one of Portugal’s largest business groups. The company was founded in 1976 after a merger of 13 companies, and was the first Iberian company to own significant generating and distribution assets on both sides of the border. Currently EDP is the third major Iberian operator of renewable energies and one of the world’s largest players in wind energy [13, 14].

    Forbes Global 2000 ranked EDP at position 437 in 2016, and EDP is worth around 2.15 billion euros according to a study conducted by the consultancy “Brand Finance”, published in June 2016 [17].

    Figure 2.1 presents EDP’s business and shows how complex the EDP universe is. EDP operates in three countries: Portugal, Spain and Brazil. In each country it has businesses in electricity production, electricity and gas distribution, and the commercialization and trading of electricity and gas.

    2.1.1 EDP’s Security Operations Center

    EDP’s SOC uses the typical components to enhance security and reduce risk, such as firewalls, antivirus, and IPS. To link all the information provided by these cybersecurity appliances by monitoring security events, EDP uses the ArcSight SIEM from Hewlett-Packard Enterprise (HPE) [21]. The SOC also carries out awareness and countermeasure procedures for internal collaborators in the presence of cyberthreats. The SOC already uses SM to view the state of its tasks, to know the status of its systems and components, and to monitor the number of incidents and vulnerabilities within the managed infrastructure. Periodic reports with graphics about the security status of the company and its applications are produced and presented to the C-level managers and the executive board.

    EDP’s SOC uses ArcSight’s SIEM to monitor and manage (create, edit, delete) incidents, create metrics, investigate possible security incidents, manage devices, perform forensics, and more.

    Figure 2.1: EDP’s business organization (EDP - Organização dos negócios, 2016)

    The SOC team considers that the SIEM is not used to its full potential for SM and countermeasures, and that ArcSight SIEM has some flaws. Although it has predefined security metrics and respective visualizations, the SIEM is limited when it comes to creating new visualizations, which are built from queries. These queries are interpreted as metrics: they obtain measures using filters and other queries, then transform those measures into meaningful data for visualization. When a query needs even a simple modification, the work required takes significant labour time, because the query has to be recreated: all the dependencies of the associated queries, and the visualization itself, need to be created from scratch. Another weakness of the ArcSight SIEM system is the inflexibility of the close date of an event. When a member of the SOC team resolves an incident and changes its state from open to closed in ArcSight, the close date is set to the current timestamp and cannot be modified. This value can be incorrect, because the incident may actually have been closed hours or even days before it was declared closed in the SIEM. The results are poor measures for metrics and reports. To bypass these two flaws of ArcSight, EDP created an application external to the SIEM (but internal to the network). The application is used to create graphics and uses other sources besides the SIEM, providing measures to be used by the SM and improving the accuracy of the results. Although the SOC team had already thought about using OSINT to feed the SIEM, it was never fully implemented due to the reputation for high false positive rates that the sources have and the inflexibility of the SIEM.
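    The effect of the immutable close date on a metric can be illustrated with a small worked example. The sketch below is only an illustration under stated assumptions: the incident records, the field names (`opened`, `resolved`, `closed_in_siem`) and the function name are hypothetical and do not come from ArcSight or from EDP’s application.

```python
from datetime import datetime, timedelta

def mean_time_to_resolve(incidents, close_field):
    """Average duration between opening and the chosen close timestamp."""
    durations = [incident[close_field] - incident["opened"] for incident in incidents]
    return sum(durations, timedelta()) / len(durations)

# Hypothetical incidents: "resolved" is when the work actually ended,
# "closed_in_siem" is the timestamp recorded when the analyst closed the case.
incidents = [
    {"opened": datetime(2017, 3, 1, 9, 0),
     "resolved": datetime(2017, 3, 1, 12, 0),
     "closed_in_siem": datetime(2017, 3, 3, 9, 0)},
    {"opened": datetime(2017, 3, 2, 9, 0),
     "resolved": datetime(2017, 3, 2, 10, 0),
     "closed_in_siem": datetime(2017, 3, 2, 11, 0)},
]

actual = mean_time_to_resolve(incidents, "resolved")
recorded = mean_time_to_resolve(incidents, "closed_in_siem")
print(actual, recorded)
```

    With these made-up values, the metric computed from the recorded close dates is many times larger than the one computed from the actual resolution times, which is the kind of distortion described above.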

    Our solution aims to help EDP’s SOC team overcome these drawbacks by developing and implementing new SM for the SIEM. These new SM will show information and status reports about the security and efficiency of the system and the reliability of the sources, and will support decision-making. New visualization methods will be produced to present these SM. A framework will be developed to gather suspicious IP addresses from public blacklists, using the OSINT concept, and to assess the IP addresses and the blacklists by correlating the collected information with the results of the company’s security incidents. The assessment will produce a more reliable output, and with SIEM rules we will monitor the communications between the company’s assets and the suspicious IP addresses. Together, the components of the solution will enhance EDP’s knowledge of its security status, increase the SOC’s efficiency, and reduce the company’s security risk.

    2.2 Security Information and Event Management

    A Security Information and Event Management (SIEM) system is a tool which combines the services of Security Information Management (SIM) and Security Event Management (SEM). Scott [19] states that the purpose of a SIEM is to gather and manage event log data. It collects and aggregates data to provide effective and beneficial analysis capabilities for the information security team. With SIEM systems, the tasks of security managers - monitoring, incident response, reporting, investigating and auditing - become more efficient, faster and more accurate, due to the combination of SIM and SEM purposes.

    The SIEM has six core functions: 1) it collects data from devices of different types; 2) it normalizes all the data collected from the different vendors and devices to a common standard; 3) it enriches the gathered event data with taxonomies and with specific details about the network and assets; 4) it stores logs and events and, through a high compression ratio, can keep several years of information; 5) it searches all the gathered information through a simple interface using a text tool; 6) it analyses all the gathered data in real time, identifying and tracing data patterns to find threats and/or breaches.

    2.3 ArcSight

    ArcSight is an HPE SIEM product and is used by the EDP SOC team. ArcSight SIEM encapsulates all the features of a normal SIEM. Its architecture consists of three components: Connectors, Loggers and Enterprise Security Management (ESM). In addition to these components, a graphical interface is provided for management. From the interface, it is possible to monitor, analyse and filter the data previously collected and processed. Creating custom filters and rules, having more than one active channel for each incident, and automatic creation of charts are a few of the features that the ArcSight SIEM system offers.

    2.3.1 Components

    Each component of ArcSight has a predefined task. The connectors collect, normalize and categorize the information from all sources. The ESM and the Logger correlate and consolidate the information and display it to the user. The main differences between the two are the storage capacity (the Logger has more space, providing a longer time window of information) and the correlation engine. EDP decided to keep six months of raw information in the Logger’s storage and three months of filtered information in the ESM’s storage. The flow of the SIEM’s process, starting from the sources and ending in the Logger and ESM user interfaces, is represented in Fig. 2.2.

    Figure 2.2: ArcSight Architecture

    Connectors

    A connector is an ArcSight software component whose purpose is to collect all the events from a variety of sources and forward them to ArcSight destination components.

    A connector is installed as an appliance or as a virtual machine, and collects the events from the logs of each connected device. The source devices can be IDSs, firewalls, databases, antivirus, operating system logs, and more.

    In the second phase, the connector normalizes all the distinct data to the Common Event Format (CEF). Each source device has its own log format, so there is an extensive amount of differently formatted information, and this step resolves the differences between vendors.
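    The normalization step can be sketched as follows. This is a minimal illustration, not the connector’s actual implementation: the header layout and the extension keys (`src`, `dst`, `dpt`, `act`) follow the publicly documented CEF structure, while the vendor name, the raw log line format and the function names are hypothetical.

```python
def _escape_header(value):
    # In CEF header fields, backslashes and pipes are escaped with a backslash.
    return str(value).replace("\\", "\\\\").replace("|", "\\|")

def _escape_extension(value):
    # In CEF extension values, backslashes and equals signs are escaped.
    return str(value).replace("\\", "\\\\").replace("=", "\\=")

def to_cef(vendor, product, version, event_class_id, name, severity, extension):
    """Build one CEF line: seven pipe-separated header fields plus key=value pairs."""
    header = "|".join(
        _escape_header(field)
        for field in ("CEF:0", vendor, product, version, event_class_id, name, severity)
    )
    pairs = " ".join(f"{key}={_escape_extension(val)}" for key, val in extension.items())
    return f"{header}|{pairs}"

def normalize_firewall(line):
    """Parse a hypothetical firewall line such as 'DENY 10.0.0.5 -> 203.0.113.7:443'."""
    action, src, _, dst = line.split()
    dst_ip, dst_port = dst.split(":")
    return to_cef(
        "VendorA", "Firewall", "1.0", "100", f"Connection {action.lower()}", 5,
        {"src": src, "dst": dst_ip, "dpt": dst_port, "act": action},
    )

print(normalize_firewall("DENY 10.0.0.5 -> 203.0.113.7:443"))
```

    Each additional vendor format would need its own parsing function, but all of them would emit the same structure, which is what allows later phases to treat events uniformly.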

    The following phase is the filtering process, in which the connector discards the unnecessary data previously collected. Filtering is needed because, for some devices, the connector cannot select only the useful information while it is collecting (for example, Windows logs). Then all the similar information is aggregated into groups, providing a faster search and thus improving performance. This last phase is called the aggregation phase.
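    The filtering and aggregation phases can be sketched as below. This is an illustrative simplification under assumptions: events are plain dictionaries, the field names (`time`, `name`, `src`, `dst`) and both function names are hypothetical, and "similar" is taken to mean identical values in the chosen key fields.

```python
def filter_events(events, unwanted_names):
    """Filtering phase: discard events whose name marks them as unnecessary."""
    return [event for event in events if event["name"] not in unwanted_names]

def aggregate(events, key_fields=("name", "src", "dst")):
    """Aggregation phase: merge events sharing the key fields into one group
    that keeps a count and the first and last time the event was seen."""
    groups = {}
    for event in events:
        key = tuple(event[field] for field in key_fields)
        if key not in groups:
            group = {field: event[field] for field in key_fields}
            group.update(count=0, first=event["time"], last=event["time"])
            groups[key] = group
        group = groups[key]
        group["count"] += 1
        group["first"] = min(group["first"], event["time"])
        group["last"] = max(group["last"], event["time"])
    return list(groups.values())

events = [
    {"time": 1, "name": "login failed", "src": "10.0.0.5", "dst": "10.0.0.9"},
    {"time": 2, "name": "heartbeat", "src": "10.0.0.5", "dst": "10.0.0.9"},
    {"time": 4, "name": "login failed", "src": "10.0.0.5", "dst": "10.0.0.9"},
    {"time": 7, "name": "login failed", "src": "10.0.0.6", "dst": "10.0.0.9"},
]
kept = filter_events(events, {"heartbeat"})
groups = aggregate(kept)
print(groups)
```

    Here four raw events become two aggregated records, so a later search has fewer entries to scan while the count and time range of each group are preserved.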

    To enrich each event with substantial information, ArcSight uses six criteria: the target object; the behaviour associated with the event; the outcome (success, failure or attempt); the type of event (according to the security domain); the device group; and the event’s significance, which separates normal events from hostile events. The connector concludes its process by sending all the essential information to the HP ArcSight Logger and the HP ArcSight ESM.

    The Connector Appliance centralizes connector management and offers unified control of all the available connectors (a connector can be installed but not available, due to deactivation or malfunction). The ArcSight Connector Appliance provides a single interface through which it is possible to configure, monitor, tune, and update the connectors. This is desirable when the organization has a significant number of connectors. A Connector Appliance can group operations and send all of them to the connectors.

    The SmartConnector is the default connector. It is an ArcSight software component which collects events and logs from all its connected sources. In addition to the normal connector features, it is possible to add, remove and edit a SmartConnector, update the connector’s table parameters, add and remove destinations, edit destination parameters and send commands to a connector.

    ArcSight provides several types of SmartConnectors, each with a particular functionality. This dissertation only describes the FlexConnector, because of its ability to connect third-party devices.

    FlexConnectors are custom connectors that can read and parse information from third-party devices and enrich ArcSight’s event standard. Some third-party devices do not have a log format known by a SmartConnector, hence the use of FlexConnectors is necessary. The Connector Appliance provides a development framework that lets the security team quickly and easily develop a FlexConnector and test it before deployment. A security team member develops a FlexConnector by creating a parser file compatible with the target sources.

    ArcSight Logger

    The ArcSight Logger is a universal log management solution [20], which offers extremely high event throughput, efficient long-term storage, and agile data analysis. The ArcSight Logger collects raw logs and events from any log-generating source and stores a large quantity of logs in a simple, manageable way. It supports cybersecurity, IT operations and log analytics with quick searches and reports about the data or the investigated incidents.

    The Logger also provides a web interface where its features can be used and where the security team can analyse and investigate the events. The Logger displays those events in tabular form, with fields that describe how the Logger received the respective event.

    ArcSight Enterprise Security Management (ESM)

    ArcSight Enterprise Security Management (ESM) is a software solution providing security event monitoring with network intelligence, correlation, anomaly detection, historical analysis tools, and automated remediation.

    The ESM connects all the previous components to correlate all the collected events, and has flexible monitoring tools to investigate and remediate. It uses a workflow framework that provides a structure of escalation levels, ensuring that events of interest reach the right security team members within the right timeframe.

    The ESM offers an automatic reporting tool, which requires a template document and an indication of the fields to be filled with ESM values. The template is uploaded to the SIEM and the security manager defines the creation date and the type of report (monthly, quarterly, or another defined period).

    The ESM uses other SIEM components and has its own sub-components to fulfil its task:

    • SmartConnectors and their sub-classes (e.g. FlexConnector);

    • Management centers: for centralized management of the connectors;

    • Correlation Optimized Retention and Retrieval (CORR): an engine which performs high-speed searches and processes events at a high rate;

    • Data sources: all the sources connected to the connectors;

    • ArcSight Manager: considered the heart of the ArcSight SIEM solution; it performs analysis, correlation, workflow and services;

    • User Interfaces:

    – ArcSight Command Center, for all the manageable data, users, devices and services - not used frequently;

    – ArcSight Console, used all the time by the SOC team for the daily tasks, using the ESM resources;

    • Use Cases, to view, configure, and transport developed sets of related resources which address a security issue;

    • ArcSight Risk Insight, an add-on product that aims at providing information about the business impact of real-time threats to assets;

    • Interactive Discovery, a separate software application that enhances the visualization (with dashboards, reports, and analytic graphics), data discovery and investigation of security data from the ArcSight platform.

    Figure 2.3 shows the ArcSight Console and displays some of the ESM resources. Apart from the multiple features in the top panel, the console can be divided into three main panels. The left panel is the navigator panel, where the active channels are displayed, organized by folders, and can be stored for future use (the security manager can also choose to see the rules, cases, users and data monitors from a drop-down list). On the right side is the inspect/edit panel where, when an event is selected, all the gathered information about that event is presented. The middle panel is divided into four sections, from top to bottom. In the first section there are six open active channels, each presenting a certain type of real-time events. The SIEM filters the events using the parameters given by the filters and/or rules to obtain the specific type of events. The “Live” active channel is the one currently selected. The second section is a summary of the selected active channel: it presents the date and time (“start” and “end”) over which the active channel is getting information, the filters used, the total number of matched events and the number of events divided by severity. The third section is the radar of the active channel and shows the events and their severity over the defined time; here it is possible to select a specific time frame. The last section is where the events are displayed; here the security manager can add tabs to see more information about the events. The console displays the “Severity level”, “End time”, “Name” and “Attackers Address”.

    ArcSight ESM Resources

    The SOC team uses the resources of the ArcSight Console as a support when analyzing, investigating and monitoring security events.

    Although ArcSight offers twenty-six useful resources (active channels, field sets, active lists, agents, assets, categories, locations, networks, vulnerabilities, zones, cases, customers, dashboards, patterns, reports, archives, rules, stages, users, data monitors, filters, knowledge base, notifications, partitions, patterns discovery and profiles), only the most relevant ones are described in this document.

    Figure 2.3: ArcSight Console View - retrieved from [11]

    The active channels belong to the monitor view category and are real-time collections of events defined by parameters (filters, rules and date) created by the SOC analyst. Active channels contain two sub active channels: header and radar. The active channel header appears at the top of every active channel and contains a statistical overview of the channel and the events passing through it. The active channel radar is a bar chart overview of the events in the active channel, sorted into segments by the event’s end time. The grid view displays each event with a set of data fields in a table view. The data fields hold information about the events (severity, attacker address, target address, etc.) and can be added or removed according to what the security analyst wants displayed with the events. These three views are presented in the middle section of Fig. 2.3.

    Because we are going to use rules in the Trustworthy Blacklists component, we describe the concept of a SIEM rule in more detail than the other SIEM resources. A rule is a programmed procedure that evaluates incoming events for specific conditions and patterns; when there is a match, it triggers actions in response. Rules thus help in analysing and monitoring specific types of events. Figure 2.4 displays the available options when creating or editing a rule.

    Figure 2.4a displays the basic options for a rule, such as its name, its description and the groups that will be notified by the rule. Figure 2.4b shows the conditions option. These can be basic conditions, i.e. conditions also used by filters or active channels (for example the target address, target hostname, attacker hostname or attacker port), or can combine filters and active lists (static or dynamic), restrict the rule to a set of assets, and relate the assets to the vulnerabilities known by the SIEM. Figure 2.4c shows the third option, aggregation, where we define the characteristics of the events that trigger a rule. The options available are the number of matching events required to trigger the rule, the time interval in which those matches must occur, and the identical and distinct event fields required for an event to be considered a match. The final option is the Action option (Fig. 2.4d), which sets the action that the rule will perform when it is triggered. The usual options are a notification to the SOC or a configuration change in the SIEM, such as creating a new list from the resulting events.

    (a) Attributes option (b) Condition option

    (c) Aggregation option (d) Action option

    Figure 2.4: SIEM rule configuration options
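    The rule options described above (condition, aggregation and action) can be sketched in a few lines. This is a conceptual illustration only and does not reflect ArcSight’s rule engine or API: the `Rule` class, its parameters and the event fields are hypothetical, and events are assumed to arrive in chronological order.

```python
from collections import defaultdict, deque

class Rule:
    """Fires `action` when `threshold` matching events that share the same
    values in `identical_fields` occur within `window_seconds`."""

    def __init__(self, name, condition, threshold, window_seconds,
                 identical_fields, action):
        self.name = name
        self.condition = condition            # condition option
        self.threshold = threshold            # aggregation: number of matches
        self.window_seconds = window_seconds  # aggregation: time interval
        self.identical_fields = identical_fields
        self.action = action                  # action option
        self._matches = defaultdict(deque)    # match times per field combination

    def evaluate(self, event):
        if not self.condition(event):
            return False
        key = tuple(event[field] for field in self.identical_fields)
        times = self._matches[key]
        times.append(event["time"])
        # Forget matches that fell outside the time window.
        while times and event["time"] - times[0] > self.window_seconds:
            times.popleft()
        if len(times) >= self.threshold:
            times.clear()
            self.action(self.name, event)
            return True
        return False

# Hypothetical active list of suspicious addresses and a simple notification action.
suspicious = {"203.0.113.7"}
alerts = []
rule = Rule("suspicious-ip", lambda e: e["dst"] in suspicious, threshold=2,
            window_seconds=60, identical_fields=("src", "dst"),
            action=lambda name, e: alerts.append((name, e["src"], e["dst"])))

events = [
    {"time": 0, "src": "10.0.0.5", "dst": "203.0.113.7"},
    {"time": 10, "src": "10.0.0.5", "dst": "198.51.100.1"},
    {"time": 30, "src": "10.0.0.5", "dst": "203.0.113.7"},
    {"time": 200, "src": "10.0.0.5", "dst": "203.0.113.7"},
]
results = [rule.evaluate(event) for event in events]
print(results, alerts)
```

    Requiring more than one match inside a time window, rather than alerting on a single event, is one simple way in which aggregation settings can reduce the number of notifications sent to the SOC.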

    The filters are sets of conditions that focus on individual attributes of the event. With filters, the SIEM reduces the number of events processed by the system. Filters also help in analysing and monitoring specific types of events, in combination with rules and data monitors.

    When certain conditions in ArcSight are triggered, a notification is created. Notifications help the SOC team to monitor and stay alert to events. Each notification contains a destination resource, the mechanism by which a security team member can assign individual users or groups in the organization to receive a specific type of notification. The notification messages can be automatic and delivered by e-mail, text message, or the ArcSight Console.

    The Dashboards display indicators that communicate the state of the organization. Dashboards are made up of individual data monitors in a variety of graphical and tabular formats. To build a dashboard it is necessary to create queries. A query contains parameters, which act like filters and select the essential information. Queries can depend on one another; these dependencies must be declared and selected for the intended dashboard.

    Figure 2.5 displays examples of dashboards with some predefined metrics. One of the dashboards is “Top categories”, a bar chart that shows categories of events and sorts events by the number of times they match a rule. The “unknown” category contains the events for which the SIEM could not detect the name and categorization of the event. On the right side of the window, a pie chart displays the top target addresses. A query counts the number of times an address is considered a target, and the results are presented in a pie chart. This pie chart only distinguishes the different IP addresses visually, by colour. ArcSight can provide visualizations of other metrics in addition to those illustrated in Fig. 2.5, such as the type of firewall rules triggered or the number of alerts per rule.
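    The counting behind a “top target addresses” chart can be illustrated with a short sketch. It is not an ArcSight query: the function name and the `dst` field are hypothetical, and the share column is an addition showing the kind of numeric value that a colour-only pie chart omits.

```python
from collections import Counter

def top_target_addresses(events, n=5):
    """Count how often each address appears as a target and return the
    n most frequent as (address, count, share of all counted events)."""
    counts = Counter(event["dst"] for event in events if event.get("dst"))
    total = sum(counts.values())
    return [(address, count, count / total)
            for address, count in counts.most_common(n)]

events = [
    {"dst": "10.0.0.9"}, {"dst": "10.0.0.9"}, {"dst": "10.0.0.9"},
    {"dst": "10.0.0.20"}, {"name": "event without target"},
]
top = top_target_addresses(events)
print(top)
```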

    Figure 2.5: ArcSight Dashboards - retrieved from [11]

    Chapter 3

    Related Work

    The use of Threat Intelligence (TI) in the organization is indispensable nowadays, due to the valuable knowledge that the information security team can extract from it and, consequently, the better efficiency and shorter response time to security incidents. Bromiley [5] defines TI as a fundamental process for the organization’s information security defence. TI uses two kinds of information discovery: external and internal. External discovery is characterized by the discovery of threats outside the organization, provided by feeds (social networks, blogs, forums, security communities or paid subscriptions) and by information sharing with government, police forces, security organizations, or organizations of the same sector or geographically close. As for internal discovery, the goal is to gain detailed knowledge about the organization’s level of security. Detecting system vulnerabilities, monitoring and detecting security anomalies, and detecting deviations from normal behaviour help to know the organization’s security status. The following sections review works related to these two topics, their theories, developments, results and conclusions. They are the foundation and initial principles for the developed work.

    3.1 Security Metrics

    How can we defend ourselves if we do not know our own weaknesses? Notwithstanding the importance of knowing the outside threats, all that awareness is insufficient if we do not know our own security status. Security metrics are the means to perform an efficient, precise and objective internal discovery. All types of organizations (academic, government, and companies) are studying SM to provide more precise and complete information about the organization’s system security and risk status.

    3.1.1 Definition and Purpose

    One of the problems related to the development of Security Metrics is the risk of defining SM incorrectly. A bad metric definition leads to misinterpretations, which give rise to inappropriate evaluation and consequently a wrong risk assessment. Therefore, instead of an improvement in security and a reduction in risk, the opposite is obtained. However, defining SM alone may not help in deciding whether the metric that the security team chooses is a SM, and whether it serves the purpose of the organization. Using both (definition and purpose) is the most appropriate option to obtain the desired results.

    Jansen [23] and Jaquith [24] state that Security Metrics are measurements based on quantifiable measures and a way to put numbers around information security activities. SM are a subset of metrics in which the quantifiable measures must be security-related, maintaining linearity and well-defined methods. Payne [33] goes further and separates measurements from metrics, saying that measurements are raw collected data and metrics are either objective or subjective human interpretations of measurements, but always simple and precise. Metrics can be an efficient tool for security managers to assess the effectiveness of their security programs and their components. With the knowledge gathered through metrics, security managers can answer questions such as “are we more secure today than we were before?”, “are we secure enough?” or even “how secure are we?”. Other authors link Security Metrics with measuring risk levels and countermeasure decision-making. Julisch [26], and Kaur and Jones [27], define SM as valid and precise functions whose return values are inversely related to the vulnerability of the measured system. SM are tools to identify the adequacy of controls, to provide a baseline for comparison purposes, to evaluate the security built, and to provide financial information. This helps management make better information security decisions. In the same work, Julisch also proves the consistency of this definition in the field of software quality metrics.

    Jansen, Jaquith and Payne share the same concept of the purpose of SM, described in [23], [24] and [33], respectively. They, along with Muthukrishnan and Palaniappan [31], Rathbun [34], and Tashi and Ghernaouti-Helie [39], argue that Security Metrics play a vital role in any organization. The purpose of SM is to provide an understanding of the security risks, to discover potential problems in the system, to detect failures in the IT controls and weaknesses in the security infrastructure, and to measure the performance of countermeasures and processes, thereby facilitating decision-making. In addition, SM strive to offer a quantitative and objective basis for security assurance, for strategic support, quality assurance, and tactical oversight, and they provide more information for the accountability of assets. These criteria can be achieved with models and algorithms that are applied to a collection of measured data.

    3.1.2 Gathering and Generating Metrics

    As will be explained, there are two methods to generate metrics. The top-down approach will be used to explain the related work on gathering and generating metrics.


    Raw Data

    There are many devices that can provide useful raw data for SM, yet where, how and what to collect can be challenging questions. These same devices can also give wrong information, so it is necessary to take care and know exactly how to answer these three “simple” questions. The correct answers help to discard raw data that is unnecessary or unusable for SM and, if it is not possible to gather the right information, to discard SM that are too complex or not feasible.

    The works of Berinato [4] and Vaarandi and Pihelgas [41] answer these questions. The use of network scans to find devices and to better understand the network’s structure provides network coverage. As Berinato states in [4], network discovery is an optimal tool to use and its raw data provides good security metrics. To extract valuable information from the logs, the work in [41] explains the need to filter the logs and remove duplicates, reducing the large amounts of duplicate and unnecessary raw data collected. Normalization is necessary when the organization has different types of logs. Correlation between the logs also provides more credible and complete data for the SM. The authors go further and explain, for each process, how beneficial the logs are. Understanding what information each log provides, and knowing what raw data the SM need and which logs provide that data, are fundamental criteria that help in the selection of the desired devices. For example, the logs of an IDS and the detection of its false positives are important data to determine flaws or wrong configurations in the IDS. Therefore, SM can also be helpful for collecting correct raw data. The SIEM already collects, normalizes and correlates the logs. However, it is necessary to define which logs should feed the SIEM, so as not to have a considerable volume of unnecessary information.
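    The filtering, normalization and correlation steps described above can be illustrated with a minimal Python sketch. It is not the procedure of [41] nor what a SIEM does internally; the event fields, the two log sources (`ids`, `fw`) and the common schema are hypothetical and chosen only to show the idea.

```python
from datetime import datetime, timezone

# Hypothetical raw events from two sources with different field names.
RAW_EVENTS = [
    {"source": "ids", "ts": "2017-03-01T10:00:00", "src_ip": "10.0.0.5", "msg": "port scan"},
    {"source": "ids", "ts": "2017-03-01T10:00:00", "src_ip": "10.0.0.5", "msg": "port scan"},
    {"source": "fw", "time": 1488362460, "ip": "10.0.0.7", "action": "deny"},
]


def normalize(event):
    """Map a source-specific event onto one common (assumed) schema."""
    if event["source"] == "ids":
        when = datetime.fromisoformat(event["ts"]).replace(tzinfo=timezone.utc)
        return {"source": "ids", "time": when, "ip": event["src_ip"], "detail": event["msg"]}
    when = datetime.fromtimestamp(event["time"], tz=timezone.utc)
    return {"source": "fw", "time": when, "ip": event["ip"], "detail": event["action"]}


def deduplicate(events):
    """Keep the first occurrence of each identical normalized event."""
    seen = set()
    unique = []
    for e in events:
        key = (e["source"], e["time"], e["ip"], e["detail"])
        if key not in seen:
            seen.add(key)
            unique.append(e)
    return unique


def correlate_by_ip(events):
    """Group events by IP address, a very simple form of correlation."""
    grouped = {}
    for e in events:
        grouped.setdefault(e["ip"], []).append(e)
    return grouped


clean = deduplicate([normalize(e) for e in RAW_EVENTS])
by_ip = correlate_by_ip(clean)
```

    Normalizing before deduplicating means that duplicates are compared in the common schema rather than in each source's own format.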

    Good vs Bad Metric

    Having valuable raw data is important, yet if the generation and selection of metrics is not done with care, all the raw data collected will result in useless and meaningless SM. The first step is to know how to differentiate a good metric from a bad metric. Jaquith [24] describes a list of good metrics and bad metrics, so that security managers can check which side their metrics belong to. Good metrics should satisfy five criteria: 1) Consistently measured, without subjective criteria; 2) Cheap to gather, preferably in an automated way; 3) Expressed as a cardinal number or percentage, not with a qualitative label like “high”, “medium” and “low”; 4) Expressed using at least one unit of measure, such as “defects”, “hours”, or “dollars”; 5) Contextually specific, and relevant enough to decision-makers that they can take action. As for bad metrics, in the same work, Jaquith considers those that are inconsistently measured, usually because they rely on subjective judgements that vary from person to person; cannot be gathered cheaply, as is typical of labour-intensive surveys and one-off spreadsheets; and do not express results with cardinal numbers and units of measure, relying instead on qualitative high/medium/low ratings, traffic lights, and letter grades.
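    As a rough illustration, the five criteria can be turned into a checklist that a security manager runs over candidate metrics. The sketch below is an assumption of this text, not something proposed in [24]: the `MetricCandidate` fields are hypothetical, and "not automated" is used as a simplified stand-in for "not cheap to gather".

```python
from dataclasses import dataclass


@dataclass
class MetricCandidate:
    name: str
    sample_value: object     # e.g. 36.5, 0.97, or "high"
    unit: str                # e.g. "hours", "defects"; empty if none
    automated: bool          # gathered without manual effort?
    subjective: bool         # depends on a person's judgement?
    decision_supported: str  # decision the metric informs; empty if none


def failed_criteria(m):
    """Return the criteria (numbered 1 to 5 as in the text) that the candidate fails."""
    failures = []
    if m.subjective:
        failures.append("1: not consistently measured")
    if not m.automated:  # simplification: manual gathering is treated as expensive
        failures.append("2: not cheap to gather")
    is_number = isinstance(m.sample_value, (int, float)) and not isinstance(m.sample_value, bool)
    if not is_number:
        failures.append("3: not a cardinal number or percentage")
    if not m.unit:
        failures.append("4: no unit of measure")
    if not m.decision_supported:
        failures.append("5: not contextually specific")
    return failures


good = MetricCandidate("Mean time to patch", 36.5, "hours", True, False, "patch prioritization")
bad = MetricCandidate("Overall risk rating", "high", "", False, True, "")
```

    Here `failed_criteria(good)` returns an empty list, while `failed_criteria(bad)` lists all five criteria.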

    Payne [33] uses an acronym to help security managers know whether a metric is good to use: “Good metrics are those that are SMART, i.e. Specific, Measurable, Attainable, Repeatable, and Time-dependent”. SM that are SMART indicate how far the system is from the security goals. Building a SM program can be difficult, and it is sometimes possible to deviate from the objective.

    Properties to select SM

    To select and generate metrics, it is necessary to establish some properties. Otherwise, metrics will be created that are not the focus of the work and do not give valuable information. Rathbun [34] states that all SM that answer a question nobody is asking are to be discarded. Here again it is necessary to understand the organization’s security objectives. SM provide decision support and nothing else.

    The works of Rathbun [34] and Tashi and Ghernaouti-Helie [39] explain that gathering SM can be easier and more efficient if some simple questions are correctly answered: What data is to be gathered? Why gather this kind of data? How to collect the data (programs, logs, etc.)? When to collect the data (frequency)? And where to collect it (which devices/assets to tap)? The answers to these questions give a good foundation (although not a technical or direct guide) for metrics. This “guide” can also be used for SM. The security manager needs to know the organization’s security objectives and to which departments the collected metrics will be presented. If the aim is to demonstrate a financial aspect to the board and executives, financial metrics are required; if it is for the operations team, technical measurements are needed. If the responsible team does not know how to measure, the probability of final results with wrong values will be high. Therefore, the main goal is to obtain reliable and understandable measurements, selecting only what is really important and in accordance with the organization’s security objectives. Vaughn et al. [42] also agree with this. They state that governmental metrics should be addressed to upward reporting and organizational reporting. As for the commercial side, metrics are more focused on answering questions about how strong the security perimeter is, what the return on investment (ROI) is, etc. Last, but not least, Jansen [23] specifies five matters that should be kept in mind when selecting metrics: Correctness and Effectiveness, Leading Versus Lagging Indicators, Organizational Security Objectives, Qualitative and Quantitative Properties, and Measurements of the Large Versus the Small. All these properties should be chosen beforehand; if this is done, the selection phase will be easier and more efficient. Thus, besides gathering data and knowing what a good and a bad metric are, selecting SM requires establishing some properties.


    Approach to generate Security Metrics

    To create SM, it is necessary to follow some guidelines. Two approaches can be adopted from [33]. Even without knowing it, many organizations already use one of these two. Figure 3.1 displays the two approaches described in the article. The first one (Fig. 3.1a) is the top-down approach: the information security team starts by defining the objectives, then selects the metrics that would help reach these objectives, and finally finds the measurements needed to generate those metrics. The big advantage of this approach is that identifying the metrics that matter takes less time.

    The second is the bottom-up approach, illustrated in Fig. 3.1b, which is the opposite. The security team starts by identifying the sources for the measurements, followed by generating the metrics that are possible with the collected measurements. Lastly, it evaluates whether those metrics would help reach the final goals. The advantage of the bottom-up approach is that it is the easiest way to obtain the metrics.

    Both approaches are recommended when no framework is implemented.
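    The difference between the two approaches can be sketched as two traversals of the same catalogue, in opposite directions. The catalogue below (metric names, objectives and required measurements) is entirely hypothetical and serves only as an illustration of the steps described in [33].

```python
# Hypothetical catalogue: the objective each metric serves and the measurements it needs.
METRICS = {
    "mean_time_to_patch": {"objective": "reduce exposure", "needs": {"patch_log", "vuln_scan"}},
    "incidents_per_month": {"objective": "improve response", "needs": {"siem_alerts"}},
    "av_coverage": {"objective": "reduce exposure", "needs": {"asset_inventory", "av_log"}},
}


def top_down(objectives):
    """Objectives -> metrics that serve them -> measurements to collect."""
    plan = {}
    for name, spec in METRICS.items():
        if spec["objective"] in objectives:
            plan[name] = sorted(spec["needs"])
    return plan


def bottom_up(available, objectives):
    """Available measurements -> feasible metrics -> keep those that help the goals."""
    feasible = [n for n, s in METRICS.items() if s["needs"] <= available]
    return [n for n in feasible if METRICS[n]["objective"] in objectives]


plan = top_down({"reduce exposure"})
usable = bottom_up({"siem_alerts", "patch_log"}, {"improve response", "reduce exposure"})
```

    In this sketch `top_down` tells the team which measurements it still has to collect, whereas `bottom_up` only returns metrics that can be produced immediately from what is already available.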

    (a) Top-Down Approach for the SM (b) Bottom-Up Approach for the SM

    Figure 3.1: Two approaches to generate security metrics

    Following or knowing about these steps is essential for the security of information, yet they aren’t strict and can be modified accordingly to be more convenient.


    3.1.3 Categorization, Classification and Taxonomies

    A taxonomy is a classification scheme that helps in the classification and management of the organization’s SM. With a well-defined taxonomy, the metrics that have been created will be more efficient and useful to the organization. SM that do not fall under the classification should be discarded, for the simple reason that they are not necessary or will not be useful. If the team thinks that a metric does not fit under the classification of the taxonomy but is important, then the taxonomy should be revised. Taxonomies improve cooperation among teams, even if they belong to different departments.

    The classification of metrics may vary among organizations, even if they use the same methodology. Jaquith [24] states that standards can be used as a guide to build frameworks, yet organizations should not misuse taxonomies and must tailor them to the organization’s structure.

    The work in [9] provides twenty metric definitions specific to business functions. In the (Security) Metrics area, business functions are a set of functions, each of which contains a set of metrics to help fulfil the function’s purpose. [9] provides seven business functions and their respective metrics. Table 3.1 (based on the information available in [9]) presents each business function and its purpose.

    Function | Purpose
    Incident Management | Determines how well the organization detects, identifies, handles and recovers from security incidents
    Vulnerability Management | Determines how well the organization manages its security exposure by identifying and mitigating known vulnerabilities
    Patch Management | Determines how well the organization is able to maintain the patch state of its systems
    Configuration Management | Presents the configuration state of the systems of the organization
    Change Management | Determines how changes to the system configuration can affect the security of the organization
    Application Security | Determines the reliability of the security model of business applications to operate as the organization intended
    Financial Metrics | Evaluates the investment made in information security

    Table 3.1: Business functions and their purpose - derived from [9]


    The work in [9] also categorizes Security Metrics into three hierarchies, based on their purpose and audience. Table 3.2 presents the categories with their functionality and audience.

    Metric Category | Functionality | Audience
    Management Metrics | Provide information about the performance of business functions and the impact on the organization | Business Management
    Operational Metrics | Improve the tasks of business functions and give a better understanding | Security Management
    Technical Metrics | Provide technical details and can be a support for the other metrics | Security Operation

    Table 3.2: Metrics Categorization - derived from [9]

    IBM also created its own taxonomy, as shown in Fig. 3.2 [26]. The purpose was to create a new type of classification of security metrics. This classification, unlike the previous ones, is based on the input data analysed by the SM. The decision to use input data as the basis of a new classification was made because it has a particularly large influence on validation, accuracy, and precision for SM.

    Figure 3.2: IBM Taxonomy: Classification of Security Metrics by their Input Types - retrieved from [26]


    Based on the evaluation of some proposed taxonomies, Savola [36] proposes a high-level information security metrics taxonomy which covers metrics for organization information security and product development.

    Figure 3.3 and Fig. 3.4 display two examples of the proposed taxonomies. Figure 3.3 illustrates a taxonomy for business-level SM with two levels (0 and 1), and Fig. 3.4 shows a more detailed taxonomy of SM for information security management with three levels. The number of levels depends on the level of detail the organization wants to work with.

    Figure 3.3: Business-level Security metrics (levels 0 and 1 taxonomy) - retrieved from [36]

    Figure 3.4: Security metrics for information security management in the organization - retrieved from [36]

    3.1.4 Visualization

    SM can provide useful information, yet if we cannot interpret and show their results, the SM can be misinterpreted or be too confusing to understand. Explaining and showing the results of the SM to C-level managers can be a challenge. The different levels of command of technical language and the vast quantity of information to present can be an obstacle, leading to confusion and misleading interpretations of the results. Jaquith [24] refers to these problems and, like Kotenko and Novikova [28], Payne [33] and Rathbun [34], suggests a way to transform the hard-won data into an elegant and clean presentation for the board.

    Jaquith [24] recognizes that in the “practical world” board members prefer qualitative methods such as “traffic lights” and pie charts. But he rejects these methods, due to their tendency to be graphically inefficient and to oversimplify issues.


    He also mentions some charts that are good choices, for example: waterfall charts, time series charts, two-by-two matrices, etc. Sometimes qualitative metrics are more important than quantitative metrics. But if qualitative metrics are derived from quantitative metrics, then based on the rules of [24] these metrics are not “good” metrics, although they can be used to represent the quantitative metrics in a more pleasant view.

    Kotenko and Novikova [28] propose a visualization technique to represent a set of security metrics. The SM were used to measure the network security status and evaluate the efficiency of the protection mechanisms. In creating this technique, the goal was to assist in solving security tasks (given by the SM) which are important to SIEM systems. There are two visual model designs presented in this work that are worth mentioning: treemaps and the graphical representation of security metrics.

    Treemaps

    Treemaps are a technique used to analyse the possible consequences of attacks and countermeasures. With interactive treemaps it is possible to represent both a vulnerability report and a network security report. Figure 3.5a and Fig. 3.5b display two examples of treemaps. In these examples the treemaps were used to analyse the network security level. The business value of the host (asset) defines the rectangle size, and the corresponding colour is the result of calculating the host security level or the severity of the vulnerability. With treemaps the security team can immediately identify the most important problems, as these maps also help to identify the risk of each sector in the organization.

    (a) Security metrics for information security management in the organization

    (b) Security metrics for information security management in the organization

    Figure 3.5: Examples of the technique treemap - retrieved from [28]
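    The mapping from host attributes to treemap rectangles can be sketched as follows. This is not the layout algorithm of [28]: it is a simple one-level "slice" layout, and the colour thresholds, the `security_level` scale in [0, 1] and the host data are assumptions made for illustration.

```python
def colour_for(level):
    """Map an assumed security level in [0, 1] (1 = most secure) to a colour name."""
    if level < 0.4:
        return "red"
    if level < 0.7:
        return "yellow"
    return "green"


def slice_layout(hosts, width=100.0, height=50.0):
    """Give each host a vertical strip whose area is proportional to its business value."""
    total = sum(h["business_value"] for h in hosts)
    x = 0.0
    rects = []
    for h in sorted(hosts, key=lambda item: item["business_value"], reverse=True):
        w = width * h["business_value"] / total
        rects.append({
            "host": h["name"],
            "x": x, "y": 0.0, "w": w, "h": height,
            "colour": colour_for(h["security_level"]),
        })
        x += w
    return rects


hosts = [
    {"name": "print", "business_value": 1, "security_level": 0.5},
    {"name": "db", "business_value": 6, "security_level": 0.3},
    {"name": "web", "business_value": 3, "security_level": 0.8},
]
rects = slice_layout(hosts)
```

    Sorting by business value places the most valuable assets first, so a large red rectangle on the left is the first thing the security team sees.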

    Circle-based Pictogram

    The circle-based pictogram is a combination of two images that addresses the problem of the user having to switch between two treemaps (or other types of graphics) to compare them. The circle-based pictogram is divided into N sectors and thus provides the values of N metrics. The outer ring represents the previous values of the metric (hence it is simpler and faster to compare). Figure 3.6 shows a host representation using this technique.

    Figure 3.6: Security metrics for information security management in the organization - retrieved from [28]

    Although the authors designed this technique for implementing countermeasures, it can be used at a whole different level. For example, the chart can show the number of vulnerabilities that are open, closed, opened this month and closed this month, and compare these numbers with those of the previous month, which is visually easier than using two circle charts. The security team can add outer rings, each representing a previous month. Implementing this technique in the SIEM dashboard could be complex or even impossible, due to the inflexibility of the SIEM. However, it is possible to implement this feature in EDP’s external application.
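    A possible data model behind such a chart is sketched below: one sector per metric, with the current month in the centre and one outer ring per previous month. The metric names, the sample counts and the structure itself are assumptions for illustration, not the design of [28] or of the EDP application.

```python
METRIC_NAMES = ["open", "closed", "opened_this_month", "closed_this_month"]


def build_pictogram(history):
    """Build sector data from a list of per-month dicts, newest month first."""
    sector_angle = 360.0 / len(METRIC_NAMES)
    sectors = []
    for i, name in enumerate(METRIC_NAMES):
        values = [month[name] for month in history]
        sectors.append({
            "metric": name,
            "start_deg": i * sector_angle,
            "end_deg": (i + 1) * sector_angle,
            "rings": values,  # index 0 = centre (current month), 1.. = outer rings
            "change": values[0] - values[1] if len(values) > 1 else None,
        })
    return sectors


history = [
    {"open": 12, "closed": 30, "opened_this_month": 5, "closed_this_month": 9},  # current month
    {"open": 16, "closed": 21, "opened_this_month": 7, "closed_this_month": 4},  # previous month
]
pictogram = build_pictogram(history)
```

    Adding another month to `history` adds one more outer ring to every sector without changing the sector angles.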

    3.1.5 Metrics used for threat intelligence

    One mechanism of threat intelligence is internal discovery. The appropriate approach to an organized, accurate, and objective discovery of internal information is the use of security metrics. Currently there are several articles that provide sets of metrics that can be used within a SIEM system. The work in [3] presents a set of security metrics to be applied with (or even on, using certain features) a SIEM, namely ArcSight. The same article presents the metric Quiet Feeds, which describes the correct functioning of the sources by counting the number of sources that are not sending data. This metric can be used to discover internal information, namely which sources (logs, antivirus, IPS) attached to the connectors are not sending information. It can also be used to identify which blacklists may not be providing information relevant to the organization.
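    The idea behind Quiet Feeds can be sketched in a few lines of Python. This is not the ArcSight implementation from [3]: the 24-hour silence threshold, the source names and the timestamps are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def quiet_feeds(last_seen, now, max_silence=timedelta(hours=24)):
    """Return the sources whose latest event is older than max_silence, or that never reported."""
    quiet = []
    for source, ts in last_seen.items():
        if ts is None or now - ts > max_silence:
            quiet.append(source)
    return sorted(quiet)


now = datetime(2017, 3, 2, 12, 0)
last_seen = {
    "ips": datetime(2017, 3, 2, 11, 55),
    "antivirus": datetime(2017, 2, 27, 8, 0),  # silent for several days
    "blacklist_a": datetime(2017, 3, 2, 3, 0),
    "blacklist_b": None,                       # never delivered any entry
}
silent = quiet_feeds(last_seen, now)
quiet_feeds_metric = len(silent)  # the metric value is the count of silent sources
```

    The count gives the metric value, while the list of names tells the team which connectors or blacklists to investigate.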

    A work that uses outside knowledge (without OSINT) together with knowledge of the organization is that of Kotenko et al. [29], which creates models of the impact of a threat on an organization. The authors design a model that identifies which asset (or set of assets) of the company is subject to a threat, by combining security metrics, which identify vulnerabilities in assets and the dependencies between them, with knowledge of the threat. With the model it is possible to know the risk value for each threat and the attack surface, which shows how wide an attack on the organization can be. However, this is not done autonomously: it requires human investigation into new threats and the collection of information on the organization’s vulnerabilities.

    3.2 Trustworthy Blacklists

    An organization enhances its security by knowing its own weaknesses and flaws; however, it is also necessary to know the external threats in order to obtain intelligence about the existing external risks and, with that, prioritize and implement the required cyber defence measures.

    3.2.1 Open Source Intelligence

    Open-source intelligence (OSINT) is the typical concept used for external discovery. Johnson [25] describes the importance of OSINT and the capabilities of the technologies that use it. These technologies allow the information security team to intelligently capture and correlate information from the Internet, producing a valuable result for the organization. Security feeds are a good source of information about external cyberthreats, and with the use of OSINT it is possible to collect pertinent information from several public feeds [5]. A blacklist is an example of a public list which contains information about cyberthreats and malicious behaviours.

    One program that can collect multiple blacklists and can be specially configured by the organization is IntelMQ [16]. IntelMQ’s main feature is to collect and process security feeds, such as logs, tweets, or blacklists, in an autonomous manner. This tool enables the information security team to efficiently collect information from a set of feeds. However, if the information we intend to collect differs from the standard syntax, it is necessary to create modules, or use similar modules, to correctly collect information from each intended source. After all the configurations and requirements for collecting the information from the public lists are completed, it is necessary to gauge to what extent the feeds are trustworthy and whether it is indeed possible to rely on the information obtained from them to implement defence mechanisms [5]; IntelMQ does not have that functionality.

    3.2.2 The efficacy and trustworthiness of Blacklists

    There are some articles that investigate the effectiveness of blacklists and which of them, in a given period, provide the most reliable information. Blacklists contain a significant rate of false positives [30, 35, 37]. However, using information acquired from blacklists is a widely used measure for monitoring and detecting malicious behaviours [30, 37].


    Sinha et al. [37] analysed four blacklists (NJABL, SORBS, SpamCop and SpamHaus), which report suspicious email addresses considered as spam. An unsolicited mail detection program was used for the confirmation and detection of false and true positives. After analysing email traffic in an academic environment (more than 7,000 computers) over 10 days, the results confirmed that blacklists contain a significant number of false positives.
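    The kind of evaluation described above, comparing blacklist entries against an independent verdict, can be sketched with set operations. The function below is an illustration and not the methodology of [37]; the sender names and the notion of `confirmed_spam` (the verdict of some separate detector) are assumptions.

```python
def blacklist_rates(blacklisted, observed_senders, confirmed_spam):
    """Compare a blacklist with an independent spam verdict over the observed traffic."""
    hits = blacklisted & observed_senders
    true_pos = hits & confirmed_spam
    false_pos = hits - confirmed_spam
    missed = (observed_senders & confirmed_spam) - blacklisted
    return {
        "true_positives": len(true_pos),
        "false_positives": len(false_pos),
        "false_negatives": len(missed),
        # share of blacklist hits that the detector did not confirm as spam
        "false_positive_share": len(false_pos) / len(hits) if hits else 0.0,
    }


blacklisted = {"a.example", "b.example", "c.example", "z.example"}
observed = {"a.example", "b.example", "c.example", "d.example", "e.example"}
confirmed_spam = {"a.example", "d.example"}
rates = blacklist_rates(blacklisted, observed, confirmed_spam)
```

    Entries such as `z.example`, which appear on the blacklist but never in the observed traffic, are ignored because nothing can be said about them from this sample.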

    Kuhrer et al. [30] aim to understand the content of blacklists and how their information is collected. They present two mechanisms: the detection of parked domains and the detection of sinkholes. They propose a mechanism to distinguish parked domains from benign domains, thus reducing a considerable number of non-benign domains present in a blacklist. A method for the detection of sinkholes, using a graph-based technique developed by the authors, and for their removal from blacklists is also described. Sinkholes are, for example, servers that contain malicious domains, but have been controlled and mitigated by security organizations, which use them to monitor the network and communications with malicious domains. The authors conclude that blacklists only contain about 20% of malicious domains, resulting in a significant number of false positives.

    In both previous works, it is complicated to state correctly and over time whether the effectiveness of a blacklist will increase or decrease.

    3.2.3 Blacklists without trustworthiness

    AlienVault’s OTX [2] is a tool similar to the one developed in our work. It gathers information about IP addresses through reports from a set of communities. After collection, the threat posed by the reported addresses is assessed considering the number of attacks, the number of reports and the type of maliciousness with which the suspected IP address is associated. The result is a list of IPs, with a threat value calculated by OTX, that can be used for monitoring or blocking IP addresses. However, the assessment is only made for the IPs that are in OTX and not for the blacklists chosen by the organization’s security team. Moreover, it does not consider the organization’s own cases to re-evaluate the reputation value of each IP.

    3.3 Summary of the chapter

    This chapter described works in the fields of SM, OSINT and public blacklist. In the field of SM exists an am