Windows Malware Scanner | Python, Windows, csv, EDR
This algorithm automates malware scans on Windows endpoint devices by reading a list of known malware names (specifically .exe files) from a .csv file, compiling them into a malware signature list, and scanning the device’s file system for matching .exe files. Any matches are flagged as potential threats and logged in an output .txt file for further analysis and evidence gathering. To run the algorithm on devices that do not have Python installed, I created an .exe file using PyInstaller that is also stored in its GitHub repository.
I am aware that this is a very basic algorithm with 3 main shortcomings:
- The unreliability of using file names to detect malware. Furthermore, not all malware is an .exe file, and fileless malware also exists.
- The number of false positives, which arise because malicious .exe files try to masquerade as legitimate files and so share names with them.
- The lack of independence from the .csv database.
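To illustrate the first point, here is a minimal sketch that is not part of the scanner: a file's name can be changed freely, while a digest of its contents cannot. The helper name sha256_of_file and the placeholder bytes are hypothetical, and content hashing is shown only as a contrast to name matching, not as the approach this project takes.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, 'evil.exe')
    renamed = os.path.join(tmp, 'notepad.exe')
    for path in (original, renamed):
        with open(path, 'wb') as fh:
            fh.write(b'placeholder bytes, not a real executable')
    # Different names, identical content: the digests are equal.
    print(sha256_of_file(original) == sha256_of_file(renamed))  # True
```

A name-based scan would treat these two files differently even though they are byte-for-byte identical.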
In the future I plan to address each point by transforming this simple Python code into a Deep Learning model that can “learn” to identify more malware characteristics and predict more accurately whether scanned files of any type are possible malware, even if there is no database present, as the model would already be trained and ready to deploy.
Anyway, here is a breakdown of the main code:
1. Imports and Initialization:
import csv
convlist = list()
unique_exes = set()
- The csv Python module is imported to read the .csv database of malware names.
- Two empty data structures are initialized: convlist, a list to temporarily store .exe file names found in the .csv file; and unique_exes, a set to ensure only unique .exe names are kept.
2. CSV Parsing and Malware Extraction:
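print('Reading and parsing .csv database...')
with open('full.csv', mode='r', encoding='utf-8') as file:
    # Only the first column is named ('F'); it is the only field inspected.
    csvfh = csv.DictReader(file, fieldnames=['F'])
    for row in csvfh:
        exe_position = row['F'].find('.exe')
        if exe_position == -1:
            continue  # no .exe in this row
        # Keep everything up to and including '.exe'.
        data = row['F'][:exe_position + 4]
        convlist.append(data)
for line in convlist:
    # Keep only the text after the last double quote: the bare file name.
    exe = line.rsplit('"', 1)[-1].strip()
    unique_exes.add(exe)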
- Opens and reads full.csv, where each row contains potential file paths or names;
- Checks each row (its first field, F) for the text .exe. If it is present, keeps everything up to and including .exe and appends it to convlist;
- After reading all rows, iterates through convlist, extracts the last section (the actual filename) after the last double quote ("), and adds it to unique_exes to ensure all entries are unique.
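The extraction steps above can be tried on individual strings. This is a minimal sketch: the helper name extract_exe_name and the sample fields are hypothetical and are not taken from the real database.

```python
def extract_exe_name(field):
    """Return the .exe name embedded in a text field, or None if there is none."""
    exe_position = field.find('.exe')
    if exe_position == -1:
        return None
    # Keep everything up to and including '.exe' ...
    data = field[:exe_position + 4]
    # ... then drop whatever precedes the last double quote.
    return data.rsplit('"', 1)[-1].strip()


# Hypothetical sample fields, for illustration only.
samples = [
    '"2024-01-01", "abc123", "invoice.exe", "exe"',
    '"2024-01-02", "def456", "report.pdf", "pdf"',
    '"2024-01-03", "789abc", "invoice.exe", "exe"',
]
unique_exes = {name for name in map(extract_exe_name, samples) if name}
print(unique_exes)  # {'invoice.exe'}
```

The duplicate invoice.exe entry collapses into one because unique_exes is a set.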
3. Writing Unique Malware Names to a Text File:
print('Threat names identified!')
print('Extracting them in a txt file...')
convlist2 = list(unique_exes)
with open('newlistmalware.txt', 'w', encoding='utf-8') as output_file:
    output_file.write(",".join(convlist2))
print('Results extracted to newlistmalware.txt')
- Converts the unique_exes set to a list (convlist2) and writes all entries to newlistmalware.txt, separated by commas.
- This file now contains the known malware names that will be used in the scanning phase.
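As a sketch of this write step and of reading the names back, the example below round-trips a small set through a temporary file. The helper names write_names and read_names are hypothetical, and sorting the names is an addition made here so that the output is the same on every run; the original code writes the set in arbitrary order.

```python
import os
import tempfile


def write_names(names, path):
    """Write names to path as one comma-separated line, sorted for stable output."""
    with open(path, 'w', encoding='utf-8') as output_file:
        output_file.write(','.join(sorted(names)))


def read_names(path):
    """Read the comma-separated names back into a set."""
    with open(path, 'r', encoding='utf-8') as input_file:
        return {name for name in input_file.read().split(',') if name}


# Round trip through a temporary file instead of newlistmalware.txt.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'names.txt')
    write_names({'b.exe', 'a.exe'}, target)
    with open(target, encoding='utf-8') as fh:
        print(fh.read())  # a.exe,b.exe
    print(read_names(target) == {'a.exe', 'b.exe'})  # True
```

One limitation of the comma-separated format is that a file name containing a comma would be split into two entries when read back.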
4. Computer Scan for Executables:
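print('All done!')
import os
path = 'C:/'  # root directory of the scan
pathfinder = os.fsencode(path)  # not used below
print('Scanning computer...')
exelist = list()
# Walk every directory under the root and collect the names of .exe files.
# Note: endswith('.exe') is case-sensitive, so a name ending in '.EXE' is skipped.
for dirpath, dirnames, filenames in os.walk(path):
    for fname in filenames:
        if fname.endswith('.exe'):
            exelist.append(fname)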
- Sets the scanning root directory as C:/ and uses os.walk to iterate through all directories and files.
- Appends each .exe file found to exelist. This list represents all executables on the local machine that will be compared with known malware.
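The scan step can also be sketched as a reusable function. This variant differs from the code above in two ways, both of which are assumptions about what is useful rather than part of the original scanner: it compares the extension in lower case, and it yields the full path instead of the bare file name. The function name find_executables is hypothetical.

```python
import os


def find_executables(root):
    """Yield the full path of every .exe file found under root."""
    # os.walk ignores directories it cannot list (for example, because of
    # permissions) when onerror is left at its default of None.
    for dirpath, dirnames, filenames in os.walk(root):
        for fname in filenames:
            # Compare in lower case so that '.EXE' is also matched.
            if fname.lower().endswith('.exe'):
                yield os.path.join(dirpath, fname)
```

Calling list(find_executables('C:/')) would mirror the scanner's root directory; pointing it at a small folder first is a quicker way to check the behaviour.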
5. Threat Comparison:
print('Computer scan finished!')
print('Comparing results to known malware database...')
threatlist = set()
with open('newlistmalware.txt', 'r', encoding='utf-8') as file:
    for line in file:
        for item in exelist:
            if item in line:
                threatlist.add(item)
                # count += 1
- Reads newlistmalware.txt, which contains known malware names.
- For each executable in exelist, checks whether its name appears in the known malware list (newlistmalware.txt). If it does, it is added to threatlist.
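Note that item in line is a substring test against the whole comma-separated line, so a short name can match inside a longer one. The sketch below shows an exact-match alternative using a set; the function name match_threats and the sample names are hypothetical, and the case-insensitive comparison is an assumption that the code above does not make.

```python
def match_threats(found_names, malware_csv_text):
    """Return the found names that exactly match a known malware name."""
    # Split the comma-separated list once and normalise case for lookup.
    known = {
        name.strip().lower()
        for name in malware_csv_text.split(',')
        if name.strip()
    }
    return {name for name in found_names if name.lower() in known}


known_text = 'data.exe,evil.exe'  # hypothetical list contents
found = ['a.exe', 'Evil.exe', 'notepad.exe']

print(match_threats(found, known_text))  # {'Evil.exe'}
print('a.exe' in known_text)             # True: the substring test would flag a.exe
```

Building the set once also avoids rescanning the whole list for every executable found on the machine.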
6. Threat Count Display:
count = len(threatlist)
print('Possible threats found:', count)
- Counts the number of unique matches (potential threats) in threatlist and displays the count to the user.
7. Output of Results:
print('Printing possible threats in a txt file!')
with open('Possiblethreats.txt', 'w', encoding='utf-8') as lst:
    lst.write(f'Total possible threats:{count}\n')
    for exe in threatlist:
        lst.write(exe + '\n')
    # lst.write('\n'.join(threatlist))
print('Possible threats txt printed!')
print('All done!')
- Writes the total threat count and list of flagged executables to Possiblethreats.txt, creating a record of all identified threats.
- Signals completion to the user.
8. To investigate whether the flagged files are actually safe and legitimate:
- Go to C:\ in your file explorer
- Type the name of the .exe file in the search bar (it may take a bit)
- When found, check its digital signature by right-clicking the file and looking for it in “Properties”.
If you see that there is no signature or that it has a strange name you should probably delete the file, BUT DO BE EXTREMELY CAUTIOUS SINCE YOU CAN ACCIDENTALLY DELETE ESSENTIAL PROGRAMS FROM YOUR DEVICE. SO, BEFORE DELETING ANYTHING, YOU SHOULD GOOGLE THE NAMES OF THE FILES AND SIGNATURES TO INVESTIGATE IF THEY ARE FROM MALICIOUS ACTORS OR NOT.
Also, due to the ever-changing nature of digital threats, the source database at bazaar.abuse.ch is constantly being updated (approx. once every hour), so the data used by the scanner might be a bit out of date. However, there is a way to update the database and get an updated scan:
- Go to https://bazaar.abuse.ch/export/
- Export the most recent .csv file
- Put the .csv file in the “materials” folder
- Run the file fullread.py in that same folder from your device’s terminal to parse the database and extract the .exe files in the document (assuming you have Python installed on your computer)
- It should produce or update a txt document, inside that same folder, with a list of the .exe threats identified in the .csv file
- Copy or move the txt file to the folder “ScannerDownLoad”
- Run the file “scannermalware” inside that folder from your device’s terminal
- It should produce or update (if you have already run the scanner) a txt document called “Possiblethreats.txt” with the list of suspicious .exe programs
IMPORTANT: This is not an “antivirus”, and it does not replace one, the use of a VPN, or the precautions needed to secure your device!