Number Theory and Distributed Ledger Technology(Block Chain)
In this lecture, I will discuss about what 'Distributed Ledger Technology' is, application of distributed and the mathematics behind it. In the end, there will be short exercise regarding the technology used to build DLT
Brief History of Ledger
Before we talk about the what is a Distributed Ledger Technology, it is important to talk about what a ledger is and the history of it.
In a nutshell, a ledger is a bookkeeping of transactions, mainly used for accounting purposes. The purpose is to record fiancial related history such as money transactions, exchanges regarding assets and etc that cannot be tampered with. According to https://bitemycoin.com/opinion/ledger-nomics/ first ever recorded ledger was found in Mesopotamia around 5000 BC (roughly 7000 years ago!) It's purpose was to act as centralized bank of recording transactions.
Decentralization of Data and Blockchain
If you are a customer of a bank, i.e) JP Morgan Chase (where I bank at), J.P Morgan Chase has centralized database that is secured and encrypted that is only available to you. This is a centralized ledger system where a user or a customer goes through the source that they trust in order to access information, disclosed to the public. However, despite security and government backed insurance such as FDIC, there are vulnerabilities to centralized ledgers. According to https://www.businessinsider.com/jpmorgan-hacked-bank-breach-2015-11, in 2014, JP Morgan Chase was a victim of largest theft of customer data in the history of US financial institutions, where data of over 100 million customers were breached and stolen.
With the evolution of internet, this is where decentralization of data/information base comes in. People are not only able to access database regarding informations such as news through online platforms, but people are also able to produce and distribute information. https://www.wikipedia.org/ is a great example of user created and distributed information and content. Emerging of the internet allowed decentralization of the information and database and recording of transaction(ledger) is not exempt from it either.
Currently, platforms such as Wikipedia has thousands of editors, updating latest information by each minute by creating pages for new information or editing pages in currently existing pages. How does this work in decentralized "Distributed Ledger" ??? In distributed ledger technology, you are able to add a "block" that consists of information regarding transcations availabe to public to existing "chain" of blocks. Such ditributed ledger technology is called "Blockchain" and there are thousands of computers with powerful computing capability called "miners" that mine such data to build blocks to add to ledgers that are shared with public.
Blockchain and Bitcoin
In the previous "block" of text that I wrote previously, I briefly explained how Distributed Ledger Technology works and the application of it, which is Blockchain. On this "block" of information, I will talk more in details of "Blockchain".
In a nutshell, blockchain is a decentralized network of chains of ledgers that contains transaction between users within the network, that is available to the public. Powerful computers known as "miners" collect information regarding transactions, create them into a block, linking them to existing blocks, which formed chained ledger, available to the public.
Currently, one of the most significant technology that applies distributed ledger technology is Bitcoin. According to https://en.wikipedia.org/wiki/Bitcoin, Bitcoin is a cryptocurency that is decentralized, digital currency that allows peer to peer transcation. The record of data uses blockchain, where the "miners" use very complicated mathematics to mine authentic bitcoin transactions, where they are used in creating blocks. The details will be explained a bit later.
Each block contains following important informations:
Number of transactions
Transaction volume (by number of Bitcoins it's sum of transactions are worth)
Block Reward (Number of Bitcoins )
Preliminaries before the we go into details
Before we go into details, you might be wondering, how does one take care of possible fradulent recording of transactions??? This is where Hashing comes in, which is one of the most important features of blockchain that guarantees the authentication of the transactions that it records.
In a nutshell, hash in a block is the hashing of the information of previous block. In each blocks, in order to authenticate, it must include hash code that outputs precisely all of exact information from previous block that current block is going to be chained to. This
is a critical feature because this is what allows blocks to be chained. Each miners in the network holds copies of these blocks and it gets updated every few minutes to 10 minutes. Hashing is what chains all the block of informations together and decentralized networks of authentication through all miners having multiple copies, authenticating blocks of information is key to prevent fraud. Blocks all contain hash information, where it uses SHA-256 hashing algorithm. It's an algorithm of 256 bit binary string system that outputs to specific output(The information that holds the authentic transaction data of each blocks) As learned in class and various sources such as https://www.youtube.com/watch?v=IGSB9zoSx70 these encryption technology is "infeasible" to guess the keys. 2 to the 256 possibilities take ages. Another advantage is that because all miners within the linked network has updated copies of the blocks itself, few groups of hackers would not be able to tamper with information recorded in the previous. Any change or addition of infromation requires update of all existing blocks through such hashing and it requires that majority of the copies of entire blockchain ledger must be identical to majority of the copies in the network. This process ensure the authenticity of each information. Hashing takes a lot of computing power and time but each miners that put together an authenticated block receives lucrative rewards in bitcoin, which promotes flow of truthful information and outcomputes those who want to tamper with transaction to commit fraud. SHA-256 algorithm also guarantees that the output of each hashing is collison free, meaning each hash code guarantees distinct and deterministic output regarding the information from previous block every time.
Transaction volume simply indicates the size of the "economy" of the sum of transaction gathered by the minor for that block. It is calculated by the total number of Bitcoins that was involved in the sum of transactions within the block.
Block reward is simply a token system, an incentive system to encourage users to be involved in building the network of blockchains. It is rewarded to miners who mine information regarding individual transactions to mine authenticated new transaction data using mathematics that I will talk about in details later, involving eliptical curve to put together blocks to be added to the distributed ledger network. You can see that each block contains some kind of reward in number of Bitcoins.
As a side note, as there are more blocks mined and found, the reward is reduced over time and number of Bitcoins mined will be capped to 21 million BTC according to many sources such as https://www.youtube.com/watch?v=bBC-nXj3Ng4 which was decided by the original creator known as "Satoshi Nakatomo"
Mathematics behind Bitcoin - Elliptic Curve
Elliptic Curve is used in cryptography in order to come up with combination of public and private key to access information. This also applies to Block chain as well
Elliptic Curve defined by y^2 + x*y + 3*y = x^3 + 2*x^2 + 4*x + 5 over Rational Field
# Some eliptical curve created from Cremona labelE=EllipticCurve('37b3')E
Elliptic Curve defined by y^2 + y = x^3 + x^2 - 3*x + 1 over Rational Field
The elliptical curve used for Bitcoin is called "secp256k1" which is y^2 = x^3 + 7
Elliptic Curve defined by y^2 = x^3 + 7 over Rational Field
ECC(Elliptical Curve Cryptography) vs RSA
We learned what RSA does in class for encryption for Diffie-Hellman, but there is a significant advantage to generating keys with more efficiency with same level of security when using ECC. You can generate same amount of keys with same level of security with numbers involving 256 bits needed in ECC, where as it would take you 3072 bits needed in RSA. This is a significant advanatage where the computation becomes much more efficient to generate secured public and private key pairs.
Group Law (Basic idea before exercise)
The heart of the key generating of public and private key is using group law https://en.wikipedia.org/wiki/Elliptic_curve#The_group_law.
The basic idea is that if you pick a two points, P and Q on a cubical elliptical curve, you can describe the points P+Q. It's basically "dot" the two points, it would yield third point. This is known as the "group law" where you add two points on a cubical elliptical curve such as y^2 = x^3+7 in order to get 3rd point on the curve.
generate a new eliptical function y^2 + y = x^3 + x + 7
In [ ]:
Write the function where using the function defined from exericse 0, that generates slope of the line with two distinct coordinates on the given curve(could be three points. Given the nature of elliptical curve and the group law, think why this is true)
HINT: two distinct coordinates p and q, returns the slope of that coordinates of the the generated coordinates.
In [ ]:
Given any coordinate, write a function that returns true if the given (x,y) is on the eliptical curve defined from exercise 0
In [ ]:
Solution for exercise 0:
Elliptic Curve defined by y^2 + y = x^3 + x + 7 over Rational Field
Solution for exercise 1:
defslope(xp,xq):ifxp==xq:return"you cannot have 0 divisor"yp=pow(xp^3+x+7,0.5)+pow(yp,0.5)yq=pow(xq^3+x+7,0.5)+pow(yq,0.5)yp=pow(yp,2)yq=pow(yq,2)return(yp-yq)/(xp-xq)