mirror of
https://github.com/caperren/school_archives.git
synced 2025-11-09 21:51:15 +00:00
Added missing classes from final year at OSU
This commit is contained in:
@@ -0,0 +1,37 @@
|
||||
##### To Run This Program #####
|
||||
Run one of the lines below replacing arguments as necessary:
|
||||
|
||||
./bloom_filter.py -d dictionary.txt -i input.txt -o output3.txt output5.txt
|
||||
OR
|
||||
python3 bloom_filter.py -d dictionary.txt -i input.txt -o output3.txt output5.txt
|
||||
|
||||
Both of the above commands have been tested on this program on the OSU Flip servers.
|
||||
There is no guarantee that this will run on any computers other than these servers!!!
|
||||
No makefile is needed for this program.
|
||||
|
||||
##### Answers to Questions #####
|
||||
a.
|
||||
The functions I chose were ripemd160, sha256, whirlpool, md5, and DSA. These are all cryptographic hashes. I chose
|
||||
them because they are less likely to generate collisions than non-cryptographic ones (at the expense of being slower),
|
||||
as well as because they are the ones build into the hashlib library for python3 on the flip servers, which guarantees
|
||||
that the grader will be able to run the program without having to install or include additional libraries.
|
||||
|
||||
b.
|
||||
ripemd160: 0.000006914
|
||||
sha256: 0.00001025
|
||||
whirlpool: 0.00001121
|
||||
md5: 0.00001025
|
||||
DSA: 0.000009298
|
||||
|
||||
ripemd160 and DSA are the fastest, though not by much. They perform better as their algorithms to generate the hash
|
||||
computer more quickly than the others. It is also likely that the length of the hash output is shorter than ones like
|
||||
sha256, which are quite large.
|
||||
|
||||
c.
|
||||
The probability of false positives is 1% as I set the hash bit array size to 5976456, which I calculated using a
|
||||
dictionary size of 623518. The result of the false positive equation -((623518 * log(0.01)/(log^2(2))) = 5976456.
|
||||
The probability of false negatives is 0%. It is not possible to have a false negative with a bloom filter.
|
||||
|
||||
d.
|
||||
The rate of false positives can be reduced by increasing the number of storable positions for the the hash bit array, or
|
||||
by reducing the number of hash functions used to reduce collisions.
|
||||
Reference in New Issue
Block a user