#PrisonProfessors #Binance #Cryptocurrency #Blockchain #Education #Rehabilitation #DigitalLiteracy #SecondChances #TransformingLives
Lesson 2: What is Hashing?
Lesson Intro:
The world’s largest crypto exchange, Binance, entered into a collaboration agreement with our nonprofit, the Prison Professors Charitable Corporation. With this collaboration, we’re able to provide justice-impacted people with a great resource they can use to learn about cryptocurrency, decentralized finance (DeFi), Web3.0, Artificial Intelligence, and other topics that relate to the digital economy. The agreement is part of our nonprofit’s ongoing efforts to help people prepare for success after prison.
Some people have access to our videos, others do not. If a staff member in your prison will authorize videos, invite them to contact Prison Professors for information on how we can send DVDs with additional educational materials to help people learn–and potentially an in-person presentation in your facility.
Each lesson includes critical thinking questions and a glossary. We encourage participants to use these lessons, and to memorialize their learning path by building a profile on Prison Professors Talent. More information on how to build a personal profile at the end of this workbook.
2. What Is Hashing?
TL;DR:
Hashing: The process of converting variable-size input data into fixed-size output (hash) using mathematical formulas called hash functions.
Key Points
Deterministic: Same input always produces the same output.
One-way: Easy to generate the output from the input, but difficult to reverse.
Fixed Output Size: Each algorithm produces outputs of a fixed size (e.g., SHA-256 always gives 256-bit output).
Cryptographic Hash Functions:
Core to blockchain and cryptocurrency for data integrity and security.
Collision Resistance: Hard to find two different inputs that produce the same hash.
Preimage Resistance: Hard to find the original input given its hash.
Second-Preimage Resistance: Hard to find another input that produces the same hash as a given input.
Example Algorithms:
SHA-256: Produces 256-bit hashes (used in Bitcoin).
SHA-1: Produces 160-bit hashes (less secure due to collisions).
Bitcoin Mining:
Miners perform hashing operations to find a valid block hash that meets difficulty criteria (e.g., a certain number of leading zeros).
Difficulty adjusts with network hash rate to maintain block time around 10 minutes.
Importance:
Ensures data integrity and authenticity.
Vital for securing blockchain transactions and operations.
Applications:
Database management, digital fingerprinting, message authentication, password security. Understanding these properties is crucial for grasping the fundamentals of blockchain technology.
Hashing refers to the process of generating a fixed-size output from an input of variable size. This is done through the use of mathematical formulas known as hash functions (implemented as hashing algorithms).
Although not all hash functions involve the use of cryptography, the so-called cryptographic hash functions are at the core of cryptocurrencies. Thanks to them, blockchains and other distributed systems are able to achieve significant levels of data integrity and security.
Both conventional and cryptographic hash functions are deterministic. Being deterministic means that as long as the input doesn’t change, the hashing algorithm will always produce the same output (also known as digest or hash).
Typically, the hashing algorithms of cryptocurrencies are designed as one-way functions, meaning they cannot be easily reverted without large amounts of computing time and resources. In other words, it is quite easy to create the output from the input, but relatively difficult to go in the opposite direction (to generate the input from the output alone). Generally speaking, the more difficult it is to find the input, the more secure the hashing algorithm is considered to be.
How does a hash function work?
Different hash functions will produce outputs of differing sizes, but the possible output sizes for each hashing algorithm is always constant. For instance, the SHA-256 algorithm can only produce outputs of 256 bits, while the SHA-1 will always generate a 160-bits digest.
To illustrate, let’s run the words “Binance” and “binance” through the SHA-256 hashing algorithm (the one used in Bitcoin).
SHA-256 | |
Input | Output (256 bits) |
Binance | f1624fcc63b615ac0e95daf9ab78434ec2e8ffe402144dc631b055f711225191 |
binance | 59bba357145ca539dcd1ac957abc1ec5833319ddcae7f5e8b5da0c36624784b2 |
Note that a minor change (the casing of the first letter) resulted in a totally different hash value. But since we are using SHA-256, the outputs will always have a fixed size of 256-bits (or 64 characters) – regardless of the input size. Also, it doesn’t matter how many times we run the two words through the algorithm, the two outputs will remain constant.
Conversely, if we run the same inputs through the SHA-1 hashing algorithm, we would have the following results:
SHA-1 | |
Input | Output (160 bits) |
Binance | 7f0dc9146570c608ac9d6e0d11f8d409a1ee6ed1 |
binance | e58605c14a76ff98679322cca0eae7b3c4e08936 |
Notably, the acronym SHA stands for Secure Hash Algorithms. It refers to a set of cryptographic hash functions that include the SHA-0 and SHA-1 algorithms along with the SHA-2 and SHA-3 groups. The SHA-256 is part of the SHA-2 group, along with SHA-512 and other variants. Currently, only the SHA-2 and SHA-3 groups are considered secure.
Why do they matter?
Conventional hash functions have a wide range of use cases, including database lookups, large files analyses, and data management. On the other hand, cryptographic hash functions are extensively used in information-security applications, such as message authentication and digital fingerprinting. When it comes to Bitcoin, cryptographic hash functions are an essential part of the mining process and also play a role in the generation of new addresses and keys.
The real power of hashing comes when dealing with enormous amounts of information. For instance, one can run a big file or dataset through a hash function and then use its output to quickly verify the accuracy and integrity of the data. This is possible because of the deterministic nature of hash functions: the input will always result in a simplified, condensed output (hash). Such a technique removes the need to store and “remember” large amounts of data.
Hashing is particularly useful within the context of blockchain technology. The Bitcoin blockchain has several operations that involve hashing, most of them within the process of mining. In fact, nearly all cryptocurrency protocols rely on hashing to link and condense groups of transactions into blocks, and also to produce cryptographic links between each block, effectively creating a blockchain.
Cryptographic hash functions
Again, a hash function that deploys cryptographic techniques may be defined as a cryptographic hash function. In general, breaking a cryptographic hash function requires a myriad of brute-force attempts. For a person to “revert” a cryptographic hash function, they would need to guess what the input was by trial and error until the corresponding output is produced. However, there is also the possibility of different inputs producing the exact same output, in which case a “collision” occurs.
Technically, a cryptographic hash function needs to follow three properties to be considered effectively secure. We may describe those as collision resistance, preimage resistance, and second preimage resistance.
Before discussing each property, let’s summarize their logic in three short sentences.
- Collision resistance: infeasible to find any two distinct inputs that produce the same hash as output.
- Preimage resistance: infeasible to “revert” the hash function (find the input from a given output).
- Second-preimage resistance: infeasible to find any second input that collides with a specified input.
Collision resistance
As mentioned, a collision happens when different inputs produce the exact same hash. Thus, a hash function is considered collision-resistant until the moment someone finds a collision. Note that collisions will always exist for any hash function because the possible inputs are infinite, while the possible outputs are finite.
Put in another way, a hash function is collision-resistant when the possibility of finding a collision is so low that it would require millions of years of computations. So despite the fact that there are no collision-free hash functions, some of them are strong enough to be considered resistant (e.g., SHA-256).
Among the various SHA algorithms, the SHA-0 and SHA-1 groups are no longer secure because collisions have been found. Currently, the SHA-2 and SHA-3 groups are considered resistant to collisions.
Preimage resistance
The property of preimage resistance is related to the concept of one-way functions. A hash function is considered preimage-resistant when there is a very low probability of someone finding the input that generated a particular output.
Note that this property is different from the previous one because an attacker would be trying to guess what was the input by looking at a given output. A collision, on the other hand, occurs when someone finds two different inputs that generate the same output, but it doesn’t matter which inputs were used.
The property of preimage resistance is valuable for protecting data because a simple hash of a message can prove its authenticity, without the need to disclose the information. In practice, many service providers and web applications store and use hashes generated from passwords rather than the passwords in plaintext.
Second-preimage resistance
To simplify, we may say that the second-preimage resistance is somewhere in between the other two properties. A second-preimage attack occurs when someone is able to find a specific input that generates the same output of another input that they already know.
In other words, a second-preimage attack involves finding a collision, but instead of searching for two random inputs that generate the same hash, they search for an input that generates the same hash that was generated by another specific input.
Therefore, any hash function that is resistant to collisions is also resistant to second-preimage attacks, as the latter will always imply a collision. However, one can still perform a preimage attack on a collision-resistant function as it implies finding a single input from a single output.
Mining
There are many steps in Bitcoin mining that involves hash functions, such as checking balances, linking transactions inputs and outputs, and hashing transactions within a block to form a Merkle Tree. But one of the main reasons Bitcoin blockchain is secure is the fact that miners need to perform a myriad of hashing operations in order to eventually find a valid solution for the next block.
Specifically, a miner has to try several different inputs when creating a hash value for their candidate block. In essence, they will only be able to validate their block if they generate an output hash that starts with a certain number of zeros. The number of zeros is what determines the mining difficulty, and it varies according to the hash rate devoted to the network.
In this case, the hash rate represents how much computer power is being invested in Bitcoin mining. If the network’s hash rate increases, the Bitcoin protocol will automatically adjust the mining difficulty so that the average time needed to mine a block remains close to 10 minutes. In contrast, if several miners decide to stop mining, causing the hash rate to drop significantly, the mining difficulty will be adjusted, making it easier to mine (until the average block time comes back to 10 minutes).
Note that miners don’t have to find collisions because there are multiple hashes they can generate as a valid output (starting with a certain number of zeros). So there are several possible solutions for a certain block, and miners only have to find one of them – according to the threshold determined by the mining difficulty.
Because Bitcoin mining is a cost-intensive task, miners have no reason to cheat the system as it would lead to significant financial losses. The more miners join a blockchain, the bigger and stronger it gets.
Closing thoughts
There is no doubt that hash functions are essential tools in computer science, especially when dealing with huge amounts of data. When combined with cryptography, hashing algorithms can be quite versatile, offering security and authentication in many different ways. As such, cryptographic hash functions are vital to nearly all cryptocurrencies networks, so understanding their properties and working mechanisms is certainly helpful for anyone interested in blockchain technology.
Critical thinking questions
- How does the deterministic nature of hash functions ensure data integrity and reliability in blockchain technology?
- What are the implications of collision resistance, preimage resistance, and second-preimage resistance for the security of cryptographic hash functions?
- Why is it important for hash functions used in cryptocurrencies like Bitcoin to be one-way functions, and how does this property enhance security?
- How does the adjustment of mining difficulty in Bitcoin mining ensure the stability and security of the blockchain network?
- In what ways can the properties of cryptographic hash functions be applied to secure digital identities and authenticate data in various applications?
Advocacy Initiative:
We encourage participants to begin memorializing the ways they are using time in prison to prepare for success upon release. I encourage participants to create a personal profile by:
- Writing a simple biography
- Writing a daily journal to show all that you’re learning
- Writing book reports that memorialize the books you read
- Writing a release plan to show the ways you’re preparing for success upon release
These strategies helped me immensely once I got out. By using my time wisely inside, I was able to raise capital, build businesses, and succeed in ways that few people would think are possible for someone who served multiple decades in prison. Anyone can do the same—if they prepare first.
If you’d like to follow in the same footsteps, I encourage you to begin building your personal profile. Get started by sending an email message to our team at:
Prison Professors Talent
[email protected]
32565 Golden Lantern, B-1026
Dana Point, CA 92629
Our interns will accept your email invite. You may then send the interns a message such as:
Dear Interns,
My name is xxx, and I am in prison. I would like to begin showing the strategies I am using to prepare for success upon release. Please send me a Release Plan Workbook, and any other books that will help me prepare for the job market. After receiving those workbooks, I will begin building my profile to show others how I am using my time inside to prepare for success outside.
Sincerely,
[Your Name]
Glossary
- Algorithm (noun): A step-by-step procedure or formula for solving a problem or performing a task.
- Bitcoin (noun): A decentralized digital currency that uses cryptography for secure transactions on a blockchain.
- Block (noun): A unit of data containing transaction information, which is added to a blockchain.
- Blockchain (noun): A decentralized digital ledger that securely records transaction data across many specialized computers on the network.
- Collision (noun): The occurrence when two different inputs produce the same hash output.
- Cryptographic Hash Function (noun): A hash function that uses cryptographic techniques to ensure data integrity and security.
- Data Integrity (noun): The accuracy and consistency of data over its lifecycle.
- Deterministic (adjective): Producing the same output from the same input every time.
- Hash (noun): The fixed-size output generated from input data using a hash function.
- Hash Function (noun): A mathematical formula that converts input data into a fixed-size output (hash).
- Hash Rate (noun): The measure of computational power used in cryptocurrency mining.
- Immutability (noun): The characteristic of being unchangeable once recorded.
- Merkle Tree (noun): A data structure used in blockchain to efficiently verify the integrity of data.
- Mining (noun): The process of performing complex calculations to validate transactions and add them to a blockchain.
- One-way Function (noun): A function that is easy to compute in one direction but difficult to reverse.
- Output Size (noun): The fixed size of the hash produced by a specific hash function.
- Preimage (noun): The original input data that is hashed to produce a specific hash.
- Resistance (noun): The difficulty of performing a certain action, such as reversing a hash function.
- SHA-256 (noun): A cryptographic hash function that produces a 256-bit hash, used in Bitcoin.
- Zero (noun): The leading character in a hash that meets the difficulty criteria in Bitcoin mining.