| Hosted by CoCalc | Download
Kernel: SageMath 9.0

Math 157: Intro to Mathematical Software

Number Theory and Distributed Ledger Technology(Block Chain)

Introduction

In this lecture, I will discuss about what 'Distributed Ledger Technology' is, application of distributed and the mathematics behind it. In the end, there will be short exercise regarding the technology used to build DLT

Brief History of Ledger

Before we talk about the what is a Distributed Ledger Technology, it is important to talk about what a ledger is and the history of it. In a nutshell, a ledger is a bookkeeping of transactions, mainly used for accounting purposes. The purpose is to record fiancial related history such as money transactions, exchanges regarding assets and etc that cannot be tampered with. According to https://bitemycoin.com/opinion/ledger-nomics/ first ever recorded ledger was found in Mesopotamia around 5000 BC (roughly 7000 years ago!) It's purpose was to act as centralized bank of recording transactions.

Decentralization of Data and Blockchain

If you are a customer of a bank, i.e) JP Morgan Chase (where I bank at), J.P Morgan Chase has centralized database that is secured and encrypted that is only available to you. This is a centralized ledger system where a user or a customer goes through the source that they trust in order to access information, disclosed to the public. However, despite security and government backed insurance such as FDIC, there are vulnerabilities to centralized ledgers. According to https://www.businessinsider.com/jpmorgan-hacked-bank-breach-2015-11, in 2014, JP Morgan Chase was a victim of largest theft of customer data in the history of US financial institutions, where data of over 100 million customers were breached and stolen.

With the evolution of internet, this is where decentralization of data/information base comes in. People are not only able to access database regarding informations such as news through online platforms, but people are also able to produce and distribute information. https://www.wikipedia.org/ is a great example of user created and distributed information and content. Emerging of the internet allowed decentralization of the information and database and recording of transaction(ledger) is not exempt from it either.

According to "TED Talks: The Blockchain Explained Simply" https://www.youtube.com/watch?v=KP_hGPQVLpA Wikipedia has 270,000 editors updating information on the web. Applying this to ledger, technology known as "Distributed Ledger" was born in 1991 by Scott Stornetta and Stuart Haber according to https://cryptonews.com/news/a-key-insight-for-the-blockchain-came-in-1991-1895.htm and https://cryptonews.com/news/a-key-insight-for-the-blockchain-came-in-1991-1895.htm in an attempt to preventing the tampering of information and record from past. The focus was "Immutable recordings of data"

Currently, platforms such as Wikipedia has thousands of editors, updating latest information by each minute by creating pages for new information or editing pages in currently existing pages. How does this work in decentralized "Distributed Ledger" ??? In distributed ledger technology, you are able to add a "block" that consists of information regarding transcations availabe to public to existing "chain" of blocks. Such ditributed ledger technology is called "Blockchain" and there are thousands of computers with powerful computing capability called "miners" that mine such data to build blocks to add to ledgers that are shared with public.

Blockchain and Bitcoin

In the previous "block" of text that I wrote previously, I briefly explained how Distributed Ledger Technology works and the application of it, which is Blockchain. On this "block" of information, I will talk more in details of "Blockchain".

In a nutshell, blockchain is a decentralized network of chains of ledgers that contains transaction between users within the network, that is available to the public. Powerful computers known as "miners" collect information regarding transactions, create them into a block, linking them to existing blocks, which formed chained ledger, available to the public.

Currently, one of the most significant technology that applies distributed ledger technology is Bitcoin. According to https://en.wikipedia.org/wiki/Bitcoin, Bitcoin is a cryptocurency that is decentralized, digital currency that allows peer to peer transcation. The record of data uses blockchain, where the "miners" use very complicated mathematics to mine authentic bitcoin transactions, where they are used in creating blocks. The details will be explained a bit later.

As an example, https://www.blockchain.com/explorer is where you can actually look up the latest blocks that were chained to existing blocks in the network. If you click on one of them, you're able to see the content of the block. Here is an example of an actual block that was mined recently, as I was writing this "block" https://www.blockchain.com/btc/block/000000000000000000103ce786f1559c03b2d8cb8c5baf6f3abebd57767c6cc0

Each block contains following important informations:

  • Hash

  • Number of transactions

  • Transaction volume (by number of Bitcoins it's sum of transactions are worth)

  • Block Reward (Number of Bitcoins )

Preliminaries before the we go into details

Before we go into details, you might be wondering, how does one take care of possible fradulent recording of transactions??? This is where Hashing comes in, which is one of the most important features of blockchain that guarantees the authentication of the transactions that it records.

Hash

In a nutshell, hash in a block is the hashing of the information of previous block. In each blocks, in order to authenticate, it must include hash code that outputs precisely all of exact information from previous block that current block is going to be chained to. This is a critical feature because this is what allows blocks to be chained. Each miners in the network holds copies of these blocks and it gets updated every few minutes to 10 minutes. Hashing is what chains all the block of informations together and decentralized networks of authentication through all miners having multiple copies, authenticating blocks of information is key to prevent fraud. Blocks all contain hash information, where it uses SHA-256 hashing algorithm. It's an algorithm of 256 bit binary string system that outputs to specific output(The information that holds the authentic transaction data of each blocks) As learned in class and various sources such as https://www.youtube.com/watch?v=IGSB9zoSx70 these encryption technology is "infeasible" to guess the keys. 2 to the 256 possibilities take ages. Another advantage is that because all miners within the linked network has updated copies of the blocks itself, few groups of hackers would not be able to tamper with information recorded in the previous. Any change or addition of infromation requires update of all existing blocks through such hashing and it requires that majority of the copies of entire blockchain ledger must be identical to majority of the copies in the network. This process ensure the authenticity of each information. Hashing takes a lot of computing power and time but each miners that put together an authenticated block receives lucrative rewards in bitcoin, which promotes flow of truthful information and outcomputes those who want to tamper with transaction to commit fraud. SHA-256 algorithm also guarantees that the output of each hashing is collison free, meaning each hash code guarantees distinct and deterministic output regarding the information from previous block every time.

Number of transactions

Each block in the chain also contains number of transactions it has collected, details of the transactions are also stored in the block if you look back to example block chain https://www.blockchain.com/btc/block/000000000000000000103ce786f1559c03b2d8cb8c5baf6f3abebd57767c6cc0 if you scroll to the bottom. You can also see the transaction address of receiver and sender of the Bitcoin at the time of the transaction, which basically says "John gave sara 0.00001 Bitcoins" as a hypothetical address.

Transaction volume

Transaction volume simply indicates the size of the "economy" of the sum of transaction gathered by the minor for that block. It is calculated by the total number of Bitcoins that was involved in the sum of transactions within the block.

Block reward

Block reward is simply a token system, an incentive system to encourage users to be involved in building the network of blockchains. It is rewarded to miners who mine information regarding individual transactions to mine authenticated new transaction data using mathematics that I will talk about in details later, involving eliptical curve to put together blocks to be added to the distributed ledger network. You can see that each block contains some kind of reward in number of Bitcoins.

As a side note, as there are more blocks mined and found, the reward is reduced over time and number of Bitcoins mined will be capped to 21 million BTC according to many sources such as https://www.youtube.com/watch?v=bBC-nXj3Ng4 which was decided by the original creator known as "Satoshi Nakatomo"

Mathematics behind Bitcoin - Elliptic Curve

Elliptic Curve is used in cryptography in order to come up with combination of public and private key to access information. This also applies to Block chain as well

Accoring to https://en.wikipedia.org/wiki/Elliptic_curve ellipitic curve takes a in a following form of: y^2 = x^3 + a*x + b:

E = EllipticCurve([1,2,3,4,5]) E
Elliptic Curve defined by y^2 + x*y + 3*y = x^3 + 2*x^2 + 4*x + 5 over Rational Field
plot(E)
Image in a Jupyter notebook
# Some eliptical curve created from Cremona label E = EllipticCurve('37b3') E
Elliptic Curve defined by y^2 + y = x^3 + x^2 - 3*x + 1 over Rational Field
plot(E)
Image in a Jupyter notebook

The elliptical curve used for Bitcoin is called "secp256k1" which is y^2 = x^3 + 7

secp256k1 = EllipticCurve([0,7]) secp256k1
Elliptic Curve defined by y^2 = x^3 + 7 over Rational Field
plot(secp256k1)
Image in a Jupyter notebook

ECC(Elliptical Curve Cryptography) vs RSA

We learned what RSA does in class for encryption for Diffie-Hellman, but there is a significant advantage to generating keys with more efficiency with same level of security when using ECC. You can generate same amount of keys with same level of security with numbers involving 256 bits needed in ECC, where as it would take you 3072 bits needed in RSA. This is a significant advanatage where the computation becomes much more efficient to generate secured public and private key pairs.

Group Law (Basic idea before exercise)

The heart of the key generating of public and private key is using group law https://en.wikipedia.org/wiki/Elliptic_curve#The_group_law. The basic idea is that if you pick a two points, P and Q on a cubical elliptical curve, you can describe the points P+Q. It's basically "dot" the two points, it would yield third point. This is known as the "group law" where you add two points on a cubical elliptical curve such as y^2 = x^3+7 in order to get 3rd point on the curve.

Exercise 0:

generate a new eliptical function y^2 + y = x^3 + x + 7

Exercise 1:

Write the function where using the function defined from exericse 0, that generates slope of the line with two distinct coordinates on the given curve(could be three points. Given the nature of elliptical curve and the group law, think why this is true)

HINT: two distinct coordinates p and q, returns the slope of that coordinates of the the generated coordinates.

Exercise 2:

Given any coordinate, write a function that returns true if the given (x,y) is on the eliptical curve defined from exercise 0

Solution for exercise 0:

myCurve = EllipticCurve([0,0,1,1,7]) myCurve
Elliptic Curve defined by y^2 + y = x^3 + x + 7 over Rational Field

Solution for exercise 1:

def slope(xp, xq): if xp == xq: return "you cannot have 0 divisor" yp = pow(xp^3 + x +7, 0.5) + pow(yp,0.5) yq = pow(xq^3+ x + 7, 0.5) + pow(yq,0.5) yp = pow(yp,2) yq = pow(yq,2) return (yp - yq)/(xp - xq)

Solution for exercise 2:

def isOnTheCurve(x, y): print(y^2, x^3) return (y^2 == x^3+ x + 7 - y)