Windows Malware Scanner | Python, Windows, csv, EDR
This algorithm is designed to automate malware scans on Windows endpoint devices. It reads a list of known malware names (specifically .exe files) from a .csv file, compiles these into a malware signature list, and scans the device’s file system for matching .exe files. Any matches are flagged as potential threats and logged in an output .txt file for further analysis and evidence gathering. To run the algorithm on devices that do not have Python installed, I created an .exe file using PyInstaller, which is also stored in its GitHub repository.
I am aware that this is a very basic algorithm with 3 main shortcomings:
- The unreliability of using file names to detect malware. Furthermore, not all malware is an .exe file; even fileless malware exists.
- The number of false positives caused by malicious .exe files that masquerade as legitimate files, since a legitimate file with the same name is flagged too.
- The lack of independence from the .csv database.
In the future I plan to address each point by transforming this simple Python code into a Deep Learning model that can “learn” to identify more malware characteristics and predict accurately whether scanned files of any type are possible malware, even if there is no database present, as the model would already be trained and ready to deploy.
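As an interim step short of the model described above, the first shortcoming (relying on file names) could be reduced by comparing file contents instead. The following is only a minimal sketch, assuming a set of known-bad SHA-256 digests is available from some source; the names sha256_of_file, is_known_hash and known_hashes are hypothetical and not part of the current scanner.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as fh:
        # Read fixed-size chunks so large executables are not loaded whole.
        for chunk in iter(lambda: fh.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def is_known_hash(path, known_hashes):
    """True if the file's digest is in known_hashes (assumed: a set of lowercase hex strings)."""
    return sha256_of_file(path) in known_hashes
```

A renamed file keeps the same digest, so this kind of check would not be fooled by a misleading name, although it still depends on a database of known samples.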
Anyway, here is a breakdown of the main code:
1. Imports and Initialization:
import csv
convlist = list()
unique_exes = set()
- The csv Python module is imported to read the .csv database of malware names.
- Two empty data structures are initialized:
  - convlist: A list to temporarily store .exe file names found in the .csv file;
  - unique_exes: A set to ensure only unique .exe names are kept.
2. CSV Parsing and Malware Extraction:
print('Reading and parsing .csv database...')
with open('full.csv', mode='r', encoding='utf-8') as file:
    csvfh = csv.DictReader(file, fieldnames=['F'])
    for row in csvfh:
        exe_position = row['F'].find('.exe')
        if exe_position == -1:
            continue  # no .exe name in this row
        # Keep everything up to and including '.exe'
        convlist.append(row['F'][:exe_position + 4])
for line in convlist:
    # The file name is whatever follows the last double quote
    exe = line.rsplit('"', 1)[-1].strip()
    unique_exes.add(exe)
- Opens and reads full.csv, where each row contains potential file paths or names;
- Checks each row to see if it contains an .exe file. If so, extracts the .exe filename and appends it to convlist;
- After reading all rows, iterates through convlist, extracts the last section (the actual filename) after the last double quote ("), and adds it to unique_exes to ensure all entries are unique.
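The extraction steps above can be sketched as a small standalone function, which makes the logic easier to try on sample rows. This is an illustrative sketch rather than the scanner's actual code: the function name extract_exe_names is hypothetical, it takes plain text rows instead of reading full.csv, and it matches '.exe' case-insensitively, which the original loop does not.

```python
def extract_exe_names(rows):
    """Return the set of .exe file names found in an iterable of text rows."""
    names = set()
    for row in rows:
        # Case-insensitive search (an assumption; the original is case-sensitive).
        end = row.lower().find('.exe')
        if end == -1:
            continue  # no executable name in this row
        candidate = row[:end + 4]
        # Keep only what follows the last double quote, i.e. the bare file name.
        names.add(candidate.rsplit('"', 1)[-1].strip())
    return names
```

For example, a row such as '"2024-01-01", "abc", "evil.exe"' yields evil.exe, and rows without an .exe name are skipped.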
3. Writing Unique Malware Names to a Text File:
print('Threat names identified!')
print('Extracting them in a txt file...')
convlist2 = list(unique_exes)
with open('newlistmalware.txt', 'w', encoding='utf-8') as output_file:
    output_file.write(",".join(convlist2))
print('Results extracted to newlistmalware.txt')
- Converts the unique_exes set to a list (convlist2) and writes all entries to newlistmalware.txt, separated by commas.
- This file now contains the known malware names that will be used in the scanning phase.
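Because Windows file names may themselves contain commas, a comma-separated file can be ambiguous to read back. One possible alternative, shown here only as a sketch, is to store one name per line. The helper names save_names and load_names are hypothetical and are not part of the scanner.

```python
def save_names(names, path):
    """Write one malware file name per line, sorted for stable output."""
    with open(path, 'w', encoding='utf-8') as fh:
        for name in sorted(names):
            fh.write(name + '\n')


def load_names(path):
    """Read the names back into a set, ignoring blank lines."""
    with open(path, 'r', encoding='utf-8') as fh:
        return {line.strip() for line in fh if line.strip()}
```

Loading the names into a set also makes later lookups fast and keeps each name as a distinct entry.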
4. Computer Scan for Executables:
print('All done!')
import os
path = 'C:/'
pathfinder = os.fsencode(path)
print('Scanning computer...')
exelist = list()
for dirpath, dirnames, filenames in os.walk(path):
    for fname in filenames:
        if fname.endswith('.exe'):
            exelist.append(fname)
- Sets the scanning root directory as C:/ and uses os.walk to iterate through all directories and files.
- Appends each .exe file found to exelist. This list represents all executables on the local machine that will be compared with known malware.
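The scan above keeps only file names, so the location of a flagged file has to be searched for later. A variation that records full paths could look like the sketch below. The function name find_executables is hypothetical, and the case-insensitive suffix check is an assumption added here so that names such as SETUP.EXE are also picked up.

```python
import os


def find_executables(root):
    """Return the full paths of all .exe files under root."""
    found = []
    # By default os.walk silently skips directories it cannot list.
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            if fname.lower().endswith('.exe'):
                found.append(os.path.join(dirpath, fname))
    return found
```

Calling find_executables('C:/') would mirror the scan described above, and os.path.basename can recover the bare file name from each path when comparing against the malware list.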
5. Threat Comparison:
print('Computer scan finished!')
print('Comparing results to known malware database...')
threatlist = set()
with open('newlistmalware.txt', 'r', encoding='utf-8') as file:
    for line in file:
        for item in exelist:
            if item in line:
                threatlist.add(item)
                # count += 1
- Reads newlistmalware.txt, which contains known malware names.
- For each executable in exelist, checks if it is mentioned in the known malware list (newlistmalware.txt). If it matches, it is added to threatlist.
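One detail worth noting about the comparison above is that item in line is a substring test on the whole comma-separated line, so a short name such as a.exe also matches inside data.exe. A stricter variant compares whole names against a set, as in the sketch below. The function name match_threats is hypothetical, and ignoring case is an assumption based on Windows file names being case-insensitive.

```python
def match_threats(exe_names, known_names):
    """Return the scanned names that exactly match a known malware name, ignoring case."""
    known = {name.lower() for name in known_names}
    return {name for name in exe_names if name.lower() in known}
```

With known names read from newlistmalware.txt and split on commas, match_threats(exelist, known_names) would play the role of threatlist while avoiding partial-name matches.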
6. Threat Count Display:
count = len(threatlist)
print('Possible threats found:', count)
- Counts the number of unique matches (potential threats) in threatlist and displays the count to the user.
7. Output of Results:
print('Printing possible threats in a txt file!')
with open('Possiblethreats.txt', 'w', encoding='utf-8') as lst:
    lst.write(f'Total possible threats:{count}\n')
    for exe in threatlist:
        lst.write(exe + '\n')
    # lst.write('\n'.join(threatlist))
print('Possible threats txt printed!')
print('All done!')
- Writes the total threat count and list of flagged executables to Possiblethreats.txt, creating a record of all identified threats.
- Signals completion to the user.
8. To investigate whether the flagged files are actually safe and legitimate:
- Go to C:\ in your file explorer
- Type the name of the .exe file in the search bar (it may take a bit)
- When found, check its digital signature by right-clicking the file and looking for it in “Properties”.
If you see that there is no signature or that it has a strange name you should probably delete the file, BUT DO BE EXTREMELY CAUTIOUS SINCE YOU CAN ACCIDENTALLY DELETE ESSENTIAL PROGRAMS FROM YOUR DEVICE. SO, BEFORE DELETING ANYTHING, YOU SHOULD GOOGLE THE NAMES OF THE FILES AND SIGNATURES TO INVESTIGATE WHETHER THEY ARE FROM MALICIOUS ACTORS OR NOT.
Also, due to the ever-changing nature of digital threats, the online database that the malware names come from is constantly being updated (approx. once every hour), so the scanner’s data might be a bit out of date. However, there is a way to update the database and get an updated scan:
- Go to https://bazaar.abuse.ch/export/
- Export the most recent .csv file
- Put the .csv file in the “materials” folder
- Run the file fullread.py in that same folder from your device’s terminal to parse the database and extract the .exe file names in the document (this assumes you have also installed Python on your computer)
- It should produce or update a txt document inside that same folder with a list of the .exe threats identified in the .csv file
- Copy or move the txt file to the folder “ScannerDownLoad”
- Run the file “scannermalware” inside that folder from your device’s terminal
- It should produce or update (if you have already run the scanner) a txt document called “Possiblethreats.txt” with the list of suspicious .exe programs
IMPORTANT: This is not an “antivirus”. It does not replace an antivirus or a VPN, nor the precautions you need to take to secure your device!