I made a SHA256SUM-Checker-for-Windows, need help!

This is how my [SHA256SUM-Checker-for-Windows] works.
Step 1: The user downloads the file they want to verify, along with SHA256SUM.txt.
Step 2: The user opens SHA256SUM.txt manually.
Step 3: The user deletes the unnecessary parts themselves.
Step 4: The user opens hash.exe, which is a command-line application, and enters the file name.
Step 5: The program generates the hash of the file.
Step 6: Lastly, the user copies the hash into the input field, and the program compares the two hashes.
But how can I make the program remove those unnecessary parts for me? Thank you!
For more screenshots, please visit my repositories.


Here is my code:

Each line in the SHA-256 sums file has the hash, then spaces, then the filename. You can split that using a regular expression with two capturing groups. One thing to consider is how to find the right file(s) in that list. A common approach is to expect that the files in the list exist under the same names on the local system, and to check them all (nonexistent files get an error message). The pathlib module might be helpful for handling file names.

A few other things that come to mind reading that code:

  • When posting code, please use a code block (enclosed in triple backticks, ```). That preserves indentation, which is particularly critical in Python.
  • I strongly caution against renaming the checksum file. Users calling a check tool do not expect it to rename files.
  • You can greatly simplify checking the different file names by looping over a list of possible filenames and opening the first one that exists. However, it might be better to accept any filename on the command line.
  • You don’t need PowerShell subprocesses: Python has a very helpful hashlib module. That has the advantage of working on any OS supported by Python. And it’ll be faster too, because it doesn’t need to start extra processes.
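To illustrate the last two points, here is a minimal sketch, not code from the script under discussion. The function names `sha256_of_file` and `first_existing` and the chunk size are made up for this example; only `hashlib.sha256` and `pathlib.Path.is_file` come from the standard library.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.sha256()
    with open(path, 'rb') as fh:
        # Read in chunks so large files do not have to fit in memory.
        while chunk := fh.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def first_existing(candidates):
    """Return the first name in `candidates` that is an existing file, or None."""
    for name in candidates:
        if Path(name).is_file():
            return name
    return None
```

For example, `first_existing(['sha256sum.txt', 'SHA256SUM.txt'])` would return whichever of those names exists first, and `sha256_of_file` replaces the external hashing process entirely.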

Thank you very much.

Thank you for your suggestion!
I made a new one.
Here is the source code.

I’m sorry to say, but that won’t work at all: the code now checks whether the hash entered by the user is the same as the last hash in the sha256sum.txt file, and the hash of the actual file is ignored completely. The two additional files are not needed at all; you should use variables to store data you will need later. And the point of reading the checksum file is that you don’t need to ask the user for a hash.

What I meant is to use a regular expression (using Python module re) to split each line in the sha256sum.txt file in its parts: The checksum and the filename. Something like this:

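import re
# Group 1: the hex checksum; group 2: the filename (everything after the whitespace)
r = re.compile(r'(^[0-9A-Fa-f]+)\s+(\S.*)$')
with open('sha256sum.txt') as fh:
    for line in fh:
        m = r.match(line)
        if m:  # lines that don't look like "checksum filename" are skipped
            checksum = m.group(1)
            filename = m.group(2)
            # compare here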

Then for each filename check if the checksum matches the actual content. No need to ask the user for anything. If there’s anything unclear, feel free to ask! :slightly_smiling_face:
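One way the "compare here" step could be filled in is sketched below. This is an illustration only, not the completed script mentioned in this thread. The names `file_sha256` and `verify`, the status strings, and the stricter pattern (exactly 64 hex digits, plus an optional `*` before the filename, as mentioned later in the thread) are assumptions made for this sketch. Filenames are resolved relative to the current directory.

```python
import hashlib
import re

# Assumed line layout: 64 hex digits, whitespace, optional '*', then the filename.
LINE_RE = re.compile(r'^([0-9A-Fa-f]{64})\s+\*?(\S.*)$')


def file_sha256(path):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, 'rb') as fh:
        for chunk in iter(lambda: fh.read(65536), b''):
            digest.update(chunk)
    return digest.hexdigest()


def verify(checksum_file):
    """Yield (filename, status) pairs; status is 'OK', 'FAILED' or 'MISSING'."""
    with open(checksum_file) as fh:
        for line in fh:
            m = LINE_RE.match(line)
            if not m:
                continue  # skip blank or malformed lines
            expected, filename = m.group(1).lower(), m.group(2)
            try:
                actual = file_sha256(filename)
            except FileNotFoundError:
                yield filename, 'MISSING'
                continue
            yield filename, 'OK' if actual == expected else 'FAILED'
```

Looping over `verify('sha256sum.txt')` and printing each pair would then report every listed file without asking the user for anything.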

I couldn’t resist completing the example above myself, if you’d like to see the result look here:


It’s pure Python, so it’ll work on any OS/platform supported by Python.

Hi airtower-luna, when I executed your code I got an error. Why did this happen?

line 30, in
with open(sys.argv[1]) as fh:
IndexError: list index out of range

I made some edits to my code based on your example. Now my program can read the checksum file without asking the user for a hash.

My script expects the name of the checksum file on the command line, instead of requiring it to be named sha256sum.txt. sys.argv (link goes to official Python documentation) is the list of command line arguments, starting with the name of the script itself. If there is no command line argument the list has no element 1, so the access fails.

You could replace the sys.argv[1] with 'sha256sum.txt' to use the same fixed name as in your script, but I prefer the flexibility. People can use different names for their checksum files after all. In a script that’s more than an example I’d check if there is an argument and write an error message saying how you should call the script if not, maybe like this:

import sys

# sys.argv[0] is the script name, so a checksum file argument means at least 2 entries
if len(sys.argv) < 2:
    print('Missing argument, you need to provide a checksum file!', file=sys.stderr)
    print(f'Usage: {sys.argv[0]} CHECKSUM_FILE', file=sys.stderr)
    sys.exit(1)

with open(sys.argv[1]) as fh:
    # ... (as before)

There are a lot more possible error conditions, e.g. the file might not exist or might not have the right format.
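As a rough sketch of how those two conditions could be handled (not part of either script in this thread): the helper `read_entries`, the `main` function, and the error messages are invented for illustration, and treating any non-empty line that does not match the pattern as an error is just one possible policy.

```python
import re
import sys

LINE_RE = re.compile(r'^([0-9A-Fa-f]+)\s+(\S.*)$')


def read_entries(checksum_file):
    """Return a list of (checksum, filename) pairs; raise ValueError on bad lines."""
    entries = []
    with open(checksum_file) as fh:
        for number, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # ignore empty lines
            m = LINE_RE.match(line)
            if not m:
                raise ValueError(f'{checksum_file}:{number}: not a checksum line')
            entries.append((m.group(1), m.group(2)))
    return entries


def main(argv):
    """Return a process exit code: 0 on success, 1 on any error."""
    if len(argv) < 2:
        print(f'Usage: {argv[0]} CHECKSUM_FILE', file=sys.stderr)
        return 1
    try:
        entries = read_entries(argv[1])
    except OSError as err:  # covers a missing or unreadable file
        print(f'Cannot read {argv[1]}: {err}', file=sys.stderr)
        return 1
    except ValueError as err:  # raised above for lines in the wrong format
        print(err, file=sys.stderr)
        return 1
    print(f'{len(entries)} entries found')
    return 0
```

A script would typically end with `sys.exit(main(sys.argv))` so the exit code reaches the shell.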

As for the latest version of your script, the biggest issue is that it creates two new files (sha256sum2.txt and sha256sum3.txt) in the current directory. You don’t need those files, just keep the hashes in variables and compare them directly.

And in situations where you really need a file to store intermediate data, use a temporary file. Python has the tempfile module to make that easy, including automatically deleting them on close().
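A small sketch of that pattern, for cases where a file really is needed. The data written here is a placeholder, and `tmp_path` is only kept to show that the file is gone afterwards.

```python
import os
import tempfile

# NamedTemporaryFile deletes the file when it is closed (delete=True is the default).
with tempfile.NamedTemporaryFile(mode='w+') as tmp:
    tmp_path = tmp.name
    tmp.write('intermediate data\n')
    tmp.seek(0)  # rewind before reading back what was written
    data = tmp.read()

# Leaving the "with" block closed the file, which also removed it from disk.
file_still_exists = os.path.exists(tmp_path)
```

If the file name is never needed, `tempfile.TemporaryFile` works the same way without exposing a path.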

Hey airtower-luna! Thank you very much.
This is the newest version of my script.

Nice, those extra files are gone now. :slightly_smiling_face:

From a user point of view I’d still wonder why you need to ask for a filename when all the names are right there in the checksum file.

From my technical point of view I’m wondering: Why aren’t you using regular expressions to read the checksum file? :cat:

Thank you :grinning:

Q: From a user point of view I’d still wonder why you need to ask for a filename when all the names are right there in the checksum file.
A: I ask for the filename because I want the user to be able to choose which file to check, rather than have the program pick the files for them. :grin:

Q: From my technical point of view I’m wondering: Why aren’t you using regular expressions to read the checksum file?
A: First, I haven’t learned how to use regular expressions yet. I will learn them later. :face_with_hand_over_mouth:
So I chose another way: the program reads the checksum file line by line and gets the filename from the user’s input.
After this, the program strips the filename, whitespace, and * from the matching line and stores the remaining hash in a variable, without editing the original checksum file. Finally, the program checks and compares the hash for the user.
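For readers following along, the regex-free approach described above could look roughly like the sketch below. It is not the code from the repository: the function name `find_checksum` and its parameters are hypothetical, and it assumes each line is a hash, whitespace, an optional `*`, and then the filename.

```python
def find_checksum(checksum_file, wanted):
    """Return the checksum listed for `wanted`, or None if it is not listed."""
    with open(checksum_file) as fh:
        for line in fh:
            # Split once on whitespace: hash on the left, the rest on the right.
            parts = line.split(maxsplit=1)
            if len(parts) != 2:
                continue  # blank line or a line without a filename
            checksum, name = parts
            name = name.strip()  # drop the trailing newline
            if name.startswith('*'):
                name = name[1:]  # drop the '*' marker in front of the name
            if name == wanted:
                return checksum.lower()
    return None
```

The returned value can then be compared directly with the hash computed from the file the user chose, and the checksum file itself is only ever read.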

This is my new SHA256-Checker repository location. Thank you!

Valid! :smile: I’d still recommend learning about them soon, they’re extremely useful for a lot of common programming problems, and this would be a good opportunity.

And I still think it’d be much more user-friendly to just verify all files listed in the checksum file than asking the user to type one of them. :wink: