Question about hashes for files

vei9

New Member
When you go to a website to download something, you sometimes see a hash for that file, usually in MD5 or SHA.

You then cross reference this hash with the hash you generate yourself when you run the file through something like CrypTool, to make sure the file is genuine...

But I do not understand the point of this exercise. If an attacker is going to modify the file for an attack, wouldn't they presumably also have access to the HTML page where the file is hosted, so that they could modify the hash value too?

I am confused...
 
The point is to verify that the file was downloaded correctly, with no errors.
Say you spent a few hours downloading an ISO from Microshaft: you would want to run the hash check to make sure the file is intact before you use it.
Yes, the hash can be manipulated by someone with ill intent, but that is the chance you take when you download from less-than-stellar sites.

BTW, I use HashTab; it works well for me and it's free.
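
For anyone who would rather script the check than use a GUI tool, here is a minimal Python sketch of the same idea: hash the downloaded file and compare the result with the value published on the download page. The function names, the choice of SHA-256, and the throwaway demo file are my own assumptions for illustration; use whichever algorithm the site actually publishes.

```python
import hashlib
import os
import tempfile


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so a large ISO never has to fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published(path, published_hex, algorithm="sha256"):
    """Compare the local digest with the one copied from the download page."""
    return file_digest(path, algorithm) == published_hex.strip().lower()


# Demo: a throwaway file stands in for the downloaded ISO (hypothetical data).
payload = b"pretend this is an ISO image"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    demo_path = tmp.name

published = hashlib.sha256(payload).hexdigest()  # what the site would list
good = matches_published(demo_path, published)
bad = matches_published(demo_path, "0" * 64)     # a hash that cannot match
os.remove(demo_path)

print(good, bad)  # True False
```

If the site lists an MD5 or SHA-1 value instead, passing "md5" or "sha1" as the algorithm should work on a standard Python build.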
 
No. More often than not, download mirrors run on a completely separate instance from the website, so an attacker who can modify the file does not necessarily have access to the page that lists the hash. Also, you are assuming this file is only on one mirror; a lot of files are hosted on multiple mirrors.

Lastly, you're missing the main reason for these sums. Security is part of it, yes, but verifying the file is the main reason sums are listed. It's the easiest and quickest way to verify that you have downloaded all of the file.
 
I want to take this chance to ask a further question.

Is it possible to ensure that the downloaded file is an exact copy of the one on the server when no MD5 or other checksum is provided?

If Firefox, Chrome, or IDM completes the download without an error or failure message, does that imply the downloaded file is bit-perfect, with an identical checksum?

Thanks
 
Is it possible to ensure that the downloaded file is an exact copy of the one on the server when no MD5 or other checksum is provided?
No. Without a sum to check against it's pointless.

If Firefox, Chrome, or IDM completes the download without an error or failure message, does that imply the downloaded file is bit-perfect, with an identical checksum?
No. See above.
 
I have a further question about the second point.

If the browser or IDM reports that a download is incomplete, then the file is indeed incomplete. But when a browser says it's complete, or IDM states it is 100% downloaded, why can't we conclude that the newly downloaded file is identical to the one on the remote server?

Doesn't "completely downloaded" or "100% downloaded" mean "bit-perfect", "flawless", or "same checksum" as the one on the remote server? I am confused.

I can't believe the engineers and programmers of Firefox or Chrome don't know about this problem and have never thought about a solution in more than 20 years of browser history.

After all, what I care about most is my file.

What I want to know is: without an MD5 checksum provided, and given that 100% complete doesn't mean an intact copy, what can I do to reduce the probability of getting imperfect files?

Thank you
 
I do not quite know how this works, but when it says 100%, it just means that you have completed downloading the packets that you have requested from the server. If I sent you three parcels, but someone tampered with one, you would still receive three. If you were not aware that one had been tampered with, you would think that the three parcels are exactly what I sent to you.

That is how I see it anyway :P
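
The parcel analogy can be sketched in a few lines of Python. This is only an illustration with made-up bytes: the "received" copy has exactly the same size as the "sent" copy, so a count of bytes looks fine, yet a hash of the contents shows they differ.

```python
import hashlib

sent = bytes(range(256)) * 4     # what the server sent (1024 made-up bytes)

tampered = bytearray(sent)
tampered[500] ^= 0x01            # flip a single bit in one "parcel"
received = bytes(tampered)

same_size = len(sent) == len(received)
same_hash = hashlib.sha256(sent).digest() == hashlib.sha256(received).digest()

print(same_size, same_hash)  # True False
```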
 
I just don't understand why lots of articles say that MD5 (or another checksum) can be used to check whether a completely downloaded file is intact compared to the one on the server. If a file is downloaded "completely" by a browser, then surely it is intact. Doesn't "complete" mean "intact"? Logically, I think there are only two possibilities: complete or incomplete, yes or no. There is no third choice.

I just don't understand why comparing the MD5 of a completed download is necessary.

If a download is "complete", then it is complete.
If a download is "incomplete", then we know that from the browser message, and we'll re-download it.

Comparing checksums seems meaningless in either case.
 
The server can send the entire file, but a packet or two may get lost along the way.
The server will not know that, and neither will your computer.
 
Is that the so-called BER (bit error rate)?
How can I avoid or reduce the chance of this happening if no checksum is provided by the original uploader?
 
You have to understand, these packets are sent via UDP. UDP does not care if you got the packet or not. You request the download, and it's sent to you via UDP. If you get the packet, cool, if not, oh well, no one will ever know. UDP does NOT send an acknowledgement back to the server. There is NOT a three way handshake like there is in TCP.

Basically, the server sends a bunch of packets and your browser or whatever attempts to get all of them. In most cases this is fine, as media can lose a packet here or there and still work fine without you knowing. However, things like operating systems are a lot more vital to verify. So, when you verify a file, you are verifying that down to every single bit that it is identical to the one on the server you got it from.

You asked if people are aware of this problem, why is nothing done? Because as I've said, it doesn't matter in most cases. Most large files (where this occurs the most) that can't take a loss in packets normally have a checksum you can check against.

As for the browser saying it's "complete", that's because it assumes it is. The browser gets the last packet in the stream and then closes it. Again, it has no way of verifying the number of packets sent by the server. It just has a count of what it got, not what should actually be there.
 

Thank you

Your explanation is very clear. Of all the forums I have asked on, you are the first one who really understands what I am asking. Thank you again.

But

I know the only way to be sure is to compare against a checksum provided by the server. But suppose the owner does not provide a checksum: what can I do to reduce the chance of getting imperfect files?

I only mean reduce; I don't mean guarantee one hundred percent that it doesn't happen.

A better LAN cable? A faster computer? Better hard drives? Would these matter much?

Thanks
 
The main thing is a good connection. Without that, you are asking for trouble, especially if your network has a lot of hiccups or goes down a lot.

If you do have a good connection, then there really isn't much else to worry about. Your chance of losing packets is significantly lower, so I wouldn't invest too much more time in thinking about it. Any other questions?
 
Yeah, HTTP downloads use TCP, which has error detection and retransmission. In general, transmission errors in individual packets are detected and the affected data is resent, so most errors are caught. But the error detection only applies to individual packets; TCP has no idea whether the packets contain parts of a file. Also, the TCP checksum isn't particularly strong: it works fine most of the time, but occasionally an error slips by, resulting in a corrupt download.
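
To illustrate why a 16-bit checksum can miss errors, here is a rough Python sketch of the ones'-complement sum that TCP's checksum is built on (the arithmetic described in RFC 1071). It is a simplification: real TCP also covers a pseudo-header and the TCP header, and the sample bytes are made up. Because the sum is just addition of 16-bit words, swapping two words leaves it unchanged, while a cryptographic hash of the data notices the difference.

```python
import hashlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]      # add the next 16-bit word
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return ~total & 0xFFFF


original = bytes.fromhex("1234abcd0001")
swapped = bytes.fromhex("abcd12340001")   # first two 16-bit words exchanged

same_checksum = internet_checksum(original) == internet_checksum(swapped)
same_sha256 = hashlib.sha256(original).digest() == hashlib.sha256(swapped).digest()

print(same_checksum, same_sha256)  # True False
```

Reordered words are only one kind of error this sum cannot see; the point is just that passing a per-packet check of this sort is weaker evidence than matching a hash of the whole file.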

I know the only way to be sure is to compare against a checksum provided by the server. But suppose the owner does not provide a checksum: what can I do to reduce the chance of getting imperfect files?
Over HTTP, there isn't much you can do, though for smaller files plain HTTP downloads are fine. For bigger files, if you have the option, you should download over a protocol that's designed to catch errors reliably and can re-download the corrupt portions to fix them.
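
The post above does not name a protocol, so the following is only a sketch of one way such a scheme can work: the file is split into fixed-size pieces, a hash is published for each piece, and the client re-fetches only the pieces whose hash does not match (BitTorrent, for example, verifies pieces against published hashes). The piece size, function names, and sample data are assumptions for illustration, and the approach still depends on the distributor supplying the piece hashes.

```python
import hashlib

PIECE_SIZE = 4  # unrealistically small so the demo is easy to follow


def piece_hashes(data, piece_size=PIECE_SIZE):
    """Hash each fixed-size piece of the data separately."""
    return [
        hashlib.sha256(data[i:i + piece_size]).hexdigest()
        for i in range(0, len(data), piece_size)
    ]


def corrupt_pieces(data, expected, piece_size=PIECE_SIZE):
    """Return indexes of pieces whose hash differs from the expected list.

    Assumes the download has the right length; a real client checks that too.
    """
    actual = piece_hashes(data, piece_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


server_copy = b"abcdefghijklmnop"
expected = piece_hashes(server_copy)        # published alongside the file

download = bytearray(b"abcdefgXijklmnop")   # one byte damaged in transit
bad = corrupt_pieces(bytes(download), expected)
print(bad)  # [1]

# Re-fetch only the damaged pieces instead of the whole file.
for i in bad:
    start = i * PIECE_SIZE
    download[start:start + PIECE_SIZE] = server_copy[start:start + PIECE_SIZE]

print(corrupt_pieces(bytes(download), expected))  # []
```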
 
Hello, but how do I know if a connection is good? Nowadays everybody uses a fast, stable broadband connection; these are not the days of modems and frequent disconnections. Do you mean that a "good" connection is a "fast" connection, or that a connection that looks fine is a good connection? Thanks
 
Make sure you have Java installed, and go to pingtest.net and run a test. It will show you your packet loss, jitter and ping.
 