Cybersecurity_Portfolio

Greetings! My name is Rafael Santamaría Ortega, and I'm an aspiring AI Security Engineer committed to ensuring safe and human-centered AI.

Windows Malware Scanner | Python, Windows, csv, EDR

This algorithm automates malware scans on Windows endpoint devices. It reads a list of known malware names (specifically .exe files) from a .csv file, compiles them into a malware signature list, and scans the device’s file system for matching .exe files. Any matches are flagged as potential threats and logged in an output .txt file for further analysis and evidence gathering. To run the algorithm on devices that do not have Python installed, I created an .exe file using PyInstaller, which is also stored in its GitHub repository.

I am aware that this is a very basic algorithm with three main shortcomings:

  1. File names are an unreliable way to detect malware. Furthermore, not all malware is an .exe file; fileless malware also exists.

  2. The number of false positives that arise because malicious .exe files often masquerade under the names of legitimate files, so a legitimate file with the same name is flagged too.

  3. The lack of independence from the .csv database.

In the future, I plan to address each point by transforming this simple Python code into a deep learning model that can “learn” to identify more malware characteristics and predict whether scanned files of any type are possible malware, even if there is no database present, as the model would already be trained and ready to deploy.

Anyway, here is a breakdown of the main code:

1. Imports and Initialization:

import csv

convlist = list()
unique_exes = set()

2. CSV Parsing and Malware Extraction:

print('Reading and parsing .csv database...')

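with open('full.csv', mode='r', encoding='utf-8') as file:
    # Read each row with its first column stored under the key 'F'
    csvfh = csv.DictReader(file, fieldnames=['F'])
    for row in csvfh:
        exe_position = row['F'].find('.exe')
        # Skip rows that do not mention an .exe file
        if exe_position == -1:
            continue
        # Keep the text up to and including '.exe'
        convlist.append(row['F'][:exe_position + 4])

    for line in convlist:
        # Keep only the file name after the last double quote
        exe = line.rsplit('"', 1)[-1].strip()
        unique_exes.add(exe)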
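
To make the parsing step concrete, here is a minimal sketch of the same extraction logic applied to in-memory text instead of full.csv. The helper name extract_exe_names and the sample rows are hypothetical, made up for illustration; the layout of the real database may differ.

```python
import csv
import io

def extract_exe_names(csv_text):
    """Return the set of unique .exe names found in the first CSV column."""
    names = set()
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=['F'])
    for row in reader:
        field = row['F']
        pos = field.find('.exe')
        if pos == -1:
            continue  # no .exe in this row
        # Cut after '.exe', then keep the text after the last double quote
        names.add(field[:pos + 4].rsplit('"', 1)[-1].strip())
    return names

# Hypothetical sample rows, not taken from the real database
sample = 'evil.exe,2024-01-01\nnotes.txt,2024-01-02\n evil.exe ,2024-01-03\n'
print(extract_exe_names(sample))
```

Because the names are collected in a set, the two rows that mention evil.exe collapse into a single entry, and the row without an .exe is skipped.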

3. Writing Unique Malware Names to a Text File:

print('Threat names identified!')

print('Extracting them in a txt file...')

convlist2 = list(unique_exes)
with open('newlistmalware.txt', 'w', encoding='utf-8') as output_file:
    output_file.write(",".join(convlist2))

print('Results extracted to newlistmalware.txt')

4. Computer Scan for Executables:

print('All done!')

import os

path = 'C:/'

print('Scanning computer...')

exelist = list()
for dirpath, dirnames, filenames in os.walk(path):
    for fname in filenames:
        # Windows file names are case-insensitive, so also match '.EXE'
        if fname.lower().endswith('.exe'):
            exelist.append(fname)
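
The scan stores only file names, so the report cannot say where a flagged file lives. Below is a minimal sketch of a variant that also records the full path. The function name find_executables is hypothetical, and the demo walks a temporary directory with made-up file names rather than C:/ so that it can run anywhere.

```python
import os
import tempfile

def find_executables(root):
    """Return (file name, full path) pairs for every .exe under root."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for fname in filenames:
            if fname.lower().endswith('.exe'):
                found.append((fname, os.path.join(dirpath, fname)))
    return found

# Demo on a throwaway directory with hypothetical file names
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'sub'))
    for rel in ('a.exe', 'notes.txt', os.path.join('sub', 'B.EXE')):
        open(os.path.join(root, rel), 'w').close()
    results = find_executables(root)
    print(sorted(name for name, _ in results))
```

Note that os.walk ignores directories it cannot read by default, so a scan run without sufficient permissions may silently miss protected folders.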

5. Threat Comparison:

print('Computer scan finished!')

print('Comparing results to known malware database...')

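threatlist = set()
with open('newlistmalware.txt', 'r', encoding='utf-8') as file:
    # The file holds comma-separated names; compare whole names, ignoring case
    known = {name.strip().lower() for name in file.read().split(',')}

for item in exelist:
    # Exact match avoids flagging 'a.exe' just because 'data.exe' is listed
    if item.lower() in known:
        threatlist.add(item)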
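
How the names are compared matters for the false-positive rate. The sketch below contrasts substring matching against a comma-joined line with exact matching against a set of whole names. The function names and the file names are hypothetical and only serve to illustrate the difference.

```python
def match_substring(scanned, joined_line):
    """Flag names that appear anywhere inside the comma-joined line."""
    return {name for name in scanned if name in joined_line}

def match_exact(scanned, joined_line):
    """Flag names that equal one whole entry of the comma-joined line."""
    known = {entry.strip().lower() for entry in joined_line.split(',')}
    return {name for name in scanned if name.lower() in known}

# Hypothetical data: two listed threats, two files found on disk
known_line = 'badupdate.exe,dropper.exe'
scanned = ['update.exe', 'dropper.exe']

print(match_substring(scanned, known_line))  # flags both names
print(match_exact(scanned, known_line))      # flags only dropper.exe
```

Substring matching flags update.exe only because it happens to be the tail of badupdate.exe. A set lookup avoids that and is also faster than scanning the whole line for every file.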

6. Threat Count Display:

count = len(threatlist)
print('Possible threats found:', count)

7. Output of Results:

print('Printing possible threats in a txt file!')

with open('Possiblethreats.txt', 'w', encoding='utf-8') as lst:
    # First line: the total, then one flagged file name per line
    lst.write(f'Total possible threats: {count}\n')
    for exe in threatlist:
        lst.write(exe + '\n')

print('Possible threats txt printed!')
print('All done!')

8. Investigating Whether Flagged Files Are Actually Safe and Legitimate:

If a flagged file has no signature or a strange name, you should probably delete it, BUT BE EXTREMELY CAUTIOUS, SINCE YOU CAN ACCIDENTALLY DELETE ESSENTIAL PROGRAMS FROM YOUR DEVICE. BEFORE DELETING ANYTHING, GOOGLE THE NAMES OF THE FILES AND SIGNATURES TO INVESTIGATE WHETHER THEY COME FROM MALICIOUS ACTORS.
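
A file name alone can be ambiguous when searching online. One common aid, which is not part of the script described here, is to compute the file's SHA-256 hash and search for that value instead. The following is a minimal sketch under that assumption; the function name sha256_of_file is hypothetical, and the demo hashes a temporary file rather than a real flagged executable.

```python
import hashlib
import os
import tempfile

def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

# Demo on a throwaway file with hypothetical contents
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, 'sample.exe')
    with open(demo_path, 'wb') as fh:
        fh.write(b'abc')
    demo_hash = sha256_of_file(demo_path)
    print(demo_hash)
```

Two files with the same name but different contents produce different hashes, so a hash search is less likely to confuse a legitimate program with malware that borrows its name.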

Also, due to the ever-changing nature of digital threats, the database on that internet site is constantly being updated (approx. once every hour), so the local copy of the data might be a bit out of date. However, there is a way to update the database and get an updated scan:

IMPORTANT: This is not an “antivirus”, and it does not replace one, nor a VPN, nor the other precautions needed to secure your device!
