{"id":266,"date":"2023-09-24T10:19:24","date_gmt":"2023-09-24T17:19:24","guid":{"rendered":"http:\/\/improdango.com\/?page_id=266"},"modified":"2023-10-20T23:44:16","modified_gmt":"2023-10-21T06:44:16","slug":"data-sources","status":"publish","type":"page","link":"http:\/\/improdango.com\/?page_id=266","title":{"rendered":"Data Sources"},"content":{"rendered":"\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-28f84493 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:25%\">\n<p><a href=\"http:\/\/improdango.com\/?page_id=12\" data-type=\"page\" data-id=\"12\">&#x2196; Intelligence Buy-In<\/a><\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:50%\"><\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:25%\">\n<p class=\"has-text-align-center has-extra-small-font-size\"><a href=\"http:\/\/improdango.com\/?page_id=268\" data-type=\"page\" data-id=\"268\">Data &#x2197;<\/a><\/p>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-media-text alignwide is-stacked-on-mobile\" style=\"grid-template-columns:25% auto\"><figure class=\"wp-block-media-text__media\"><img loading=\"lazy\" decoding=\"async\" width=\"200\" height=\"200\" src=\"http:\/\/improdango.com\/wp-content\/uploads\/2023\/09\/DFIR_Report.jpg\" alt=\"\" class=\"wp-image-279 size-full\" srcset=\"http:\/\/improdango.com\/wp-content\/uploads\/2023\/09\/DFIR_Report.jpg 200w, http:\/\/improdango.com\/wp-content\/uploads\/2023\/09\/DFIR_Report-150x150.jpg 150w\" sizes=\"auto, (max-width: 200px) 100vw, 200px\" \/><\/figure><div class=\"wp-block-media-text__content\">\n<p><strong>DFIR Report: Digital Forensics and Incident Response Report<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>A company dedicated to documenting real intrusions by providing reports, analysis, and 
services to experts.<\/li>\n\n\n\n<li>Data is generated by members and monitored\/compiled by a team whose members have held CTO positions and\/or focus on information security.<\/li>\n\n\n\n<li>The site contains many reports, one of which is exclusively about ransomware.<\/li>\n\n\n\n<li>Ransomware encrypts company files,\u00a0bringing bank services to a halt, as demonstrated in our first diamond model.\u00a0 DFIR Report provides the appropriate data to mine for our research.\u00a0 This dataset contains a collection of malware files used in ransomware attacks, which can be combined with email data to identify whether an email is malicious.\u00a0<\/li>\n<\/ul>\n<\/div><\/div>\n\n\n\n<div class=\"wp-block-media-text alignwide is-stacked-on-mobile\" style=\"grid-template-columns:24% auto\"><figure class=\"wp-block-media-text__media\"><img loading=\"lazy\" decoding=\"async\" width=\"400\" height=\"400\" src=\"http:\/\/improdango.com\/wp-content\/uploads\/2023\/09\/CanadianInstituteForCybersecurity.jpg\" alt=\"\" class=\"wp-image-286 size-full\" srcset=\"http:\/\/improdango.com\/wp-content\/uploads\/2023\/09\/CanadianInstituteForCybersecurity.jpg 400w, http:\/\/improdango.com\/wp-content\/uploads\/2023\/09\/CanadianInstituteForCybersecurity-300x300.jpg 300w, http:\/\/improdango.com\/wp-content\/uploads\/2023\/09\/CanadianInstituteForCybersecurity-150x150.jpg 150w\" sizes=\"auto, (max-width: 400px) 100vw, 400px\" \/><\/figure><div class=\"wp-block-media-text__content\">\n<p><strong>Intrusion&nbsp;Detection&nbsp;Evaluation Dataset&nbsp;(CIC-IDS2017)<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Access through Kaggle and Canadian Institute for Cybersecurity<\/li>\n\n\n\n<li>This dataset provides up-to-date attacks with realistic background traffic for network attack analysis.&nbsp; The dataset was built using 25 profiles of typical human behavior on a network based on HTTP, HTTPS, FTP, SSH, and email protocols.<\/li>\n\n\n\n<li>The dataset is helpful for financial 
institutions to simulate network intrusions and test AI models to reduce false alarms and increase the proportion of fraudulent cases detected.\u00a0<\/li>\n\n\n\n<li>Data was generated by researchers for use in machine learning and deep learning cybersecurity models.<\/li>\n\n\n\n<li>The dataset being used: \u201cThursday-WorkingHours-Morning-WebAttacks.pcap_ISCX\u201d<\/li>\n<\/ul>\n<\/div><\/div>\n\n\n\n<div class=\"wp-block-media-text alignwide is-stacked-on-mobile\" style=\"grid-template-columns:25% auto\"><figure class=\"wp-block-media-text__media\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"465\" src=\"http:\/\/improdango.com\/wp-content\/uploads\/2023\/09\/kaggle-logo-transparent-300-1024x465.png\" alt=\"\" class=\"wp-image-289 size-full\" srcset=\"http:\/\/improdango.com\/wp-content\/uploads\/2023\/09\/kaggle-logo-transparent-300-1024x465.png 1024w, http:\/\/improdango.com\/wp-content\/uploads\/2023\/09\/kaggle-logo-transparent-300-300x136.png 300w, http:\/\/improdango.com\/wp-content\/uploads\/2023\/09\/kaggle-logo-transparent-300-768x349.png 768w, http:\/\/improdango.com\/wp-content\/uploads\/2023\/09\/kaggle-logo-transparent-300.png 1056w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure><div class=\"wp-block-media-text__content\">\n<p><strong>Phishing Email Detection Dataset<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>This dataset on Kaggle is intended for training machine learning tools to determine which emails are potential phishing emails based on the content of the body of the email.<\/li>\n\n\n\n<li>The dataset was three months old at the time of writing, so it is relevant and not obsolete.<\/li>\n\n\n\n<li>The CSV file contains over 28,000 rows of email body text, each with a classification of the email as phishing or safe. 
The dataset is important and relevant because phishing appears in our diamond models and can serve as an entryway for attackers to launch other attacks.<\/li>\n<\/ul>\n<\/div><\/div>\n\n\n\n<div class=\"wp-block-media-text alignwide is-stacked-on-mobile\" style=\"grid-template-columns:25% auto\"><figure class=\"wp-block-media-text__media\"><img loading=\"lazy\" decoding=\"async\" width=\"317\" height=\"129\" src=\"http:\/\/improdango.com\/wp-content\/uploads\/2023\/09\/StratosphereLab.png\" alt=\"\" class=\"wp-image-281 size-full\" srcset=\"http:\/\/improdango.com\/wp-content\/uploads\/2023\/09\/StratosphereLab.png 317w, http:\/\/improdango.com\/wp-content\/uploads\/2023\/09\/StratosphereLab-300x122.png 300w\" sizes=\"auto, (max-width: 317px) 100vw, 317px\" \/><\/figure><div class=\"wp-block-media-text__content\">\n<p><strong>CTU Mixed Capture 5 Dataset<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Stratosphere Laboratory is an organization which originated at the Czech Technical University in Prague, as a continuation of the work of a PhD student.&nbsp; Stratosphere IPS offers a free machine-learning-based intrusion prevention system, along with other ongoing projects and publicly available datasets.<\/li>\n\n\n\n<li>This dataset is 173 MB and contains packet capture data from a simulated malware attack.<\/li>\n\n\n\n<li>Data is from 2015. It is valuable due to its complexity and comprehensiveness.&nbsp; This dataset should provide a useful snapshot of a realistic malware use case.<\/li>\n<\/ul>\n<\/div><\/div>\n\n\n\n<div class=\"wp-block-media-text alignwide is-stacked-on-mobile\" style=\"grid-template-columns:25% auto\"><figure class=\"wp-block-media-text__media\"><img loading=\"lazy\" decoding=\"async\" width=\"230\" height=\"225\" src=\"http:\/\/improdango.com\/wp-content\/uploads\/2023\/09\/github-mark-white.png\" alt=\"\" class=\"wp-image-287 size-full\"\/><\/figure><div class=\"wp-block-media-text__content\">\n<p><strong>JavaScript Vulnerability 
Dataset<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Accessed through GitHub<\/li>\n\n\n\n<li>The JavaScript Vulnerability dataset\u00a0contains vulnerability\u00a0information from the public databases of the Node Security\u00a0Project and the\u00a0Snyk\u00a0platform\u00a0(12,126 rows).<\/li>\n\n\n\n<li>JavaScript is gaining popularity as a\u00a0programming language for server-side web applications, mobile apps,\u00a0and IoT implementations.<\/li>\n\n\n\n<li>The wide-scale adoption by developers of third-party\u00a0packages, such as those hosted by the Node Package Manager (npm), increases JavaScript\u00a0vulnerabilities.\u00a0<\/li>\n\n\n\n<li>The dataset can be used for building prediction models that determine, from the associated static\u00a0source code metrics, whether JavaScript functions contain vulnerabilities.<\/li>\n\n\n\n<li>This can benefit our work since JavaScript-based IoT devices were included among our diamond models.<\/li>\n<\/ul>\n<\/div><\/div>\n\n\n\n<div class=\"wp-block-media-text alignwide is-stacked-on-mobile\" style=\"grid-template-columns:25% auto\"><figure class=\"wp-block-media-text__media\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"465\" src=\"http:\/\/improdango.com\/wp-content\/uploads\/2023\/09\/kaggle-logo-transparent-300-1024x465.png\" alt=\"\" class=\"wp-image-289 size-full\" srcset=\"http:\/\/improdango.com\/wp-content\/uploads\/2023\/09\/kaggle-logo-transparent-300-1024x465.png 1024w, http:\/\/improdango.com\/wp-content\/uploads\/2023\/09\/kaggle-logo-transparent-300-300x136.png 300w, http:\/\/improdango.com\/wp-content\/uploads\/2023\/09\/kaggle-logo-transparent-300-768x349.png 768w, http:\/\/improdango.com\/wp-content\/uploads\/2023\/09\/kaggle-logo-transparent-300.png 1056w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure><div class=\"wp-block-media-text__content\">\n<p><strong>Malicious URLs Dataset<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>This dataset on Kaggle 
is meant as a training device to develop\u00a0machine learning models that help determine which URLs have malicious content.<\/li>\n\n\n\n<li>This is a large CSV file with a dataset of 651,191 URLs, of which 428,103 URLs are considered safe.<\/li>\n\n\n\n<li>Malicious URLs can host unsolicited\u00a0threats that can lead to\u00a0malware installation, phishing, and theft of private banking information.<\/li>\n\n\n\n<li>This dataset is important and relevant because\u00a0malware installation, phishing, and theft of private banking\u00a0information all appear in our diamond models and can provide fertile ground for adversarial attacks.<\/li>\n<\/ul>\n<\/div><\/div>\n","protected":false},"excerpt":{"rendered":"<p>&#x2196; Intelligence Buy-In Data &#x2197;<\/p>\n","protected":false},"author":3,"featured_media":0,"parent":0,"menu_order":3,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-266","page","type-page","status-publish","hentry"],"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"http:\/\/improdango.com\/index.php?rest_route=\/wp\/v2\/pages\/266","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/improdango.com\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"http:\/\/improdango.com\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"http:\/\/improdango.com\/index.php?rest_route=\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"http:\/\/improdango.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=266"}],"version-history":[{"count":10,"href":"http:\/\/improdango.com\/index.php?rest_route=\/wp\/v2\/pages\/266\/revisions"}],"predecessor-version":[{"id":442,"href":"http:\/\/improdango.com\/index.php?rest_route=\/wp\/v2\/pages\/266\/revisions\/442"}],"wp:attachment":[{"href":"http:\/\/improdango.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=266"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}