Minor Presentation - Distributed file system

Distributed file system with storage optimization through CUDA using disjunctive property of π (pi)


Page 1: Minor Presentation - Distributed file system

Distributed file system with storage optimization through CUDA using disjunctive property of π (pi)

Page 2: Minor Presentation - Distributed file system

Why π ?

● π is conjectured to be a normal number: a real number whose infinite sequence of digits, in every base b, is uniformly distributed

● If so, its digits form a disjunctive sequence: an infinite sequence in which every finite string appears as a substring

π (base 16) = 3.243f6a88 85a308d3 13198a2e 03707344 a4093822 299f31d0 082efa98 ec4e6c89 452821e6 38d01377 be5466cf 34e90c6c c0ac29b7 c97c50dd 3f84d5b5 b5470917 9216d5d9 8979fb1b d131…

Page 3: Minor Presentation - Distributed file system

Every file in π !!

Why waste exabytes of storage on files when they are already in π !!

That’s right. Every file you've ever created, or anyone else has created or will create, right there in π

Page 4: Minor Presentation - Distributed file system

Storing data

Storing data is a multi-step process:

● Dividing the files into blocks

Blocks can be fixed-size (such as 1 byte) or dynamically sized

● Searching for the blocks in π, with the digits generated using either:

● the Gauss-Legendre algorithm or

● Borwein's algorithm
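The two storing steps above can be sketched in Python. This is a minimal sketch under stated assumptions, not the project's implementation: it generates hexadecimal digits of π with the Gauss-Legendre iteration using the standard `decimal` module, uses fixed 1-byte blocks, and records each block as an (index, length) pair counted in hex digits. The function names `pi_hex_digits`, `store` and `retrieve` are hypothetical.

```python
from decimal import Decimal, localcontext


def pi_hex_digits(count):
    """Return the first `count` hexadecimal digits of pi after the point."""
    with localcontext() as ctx:
        # Roughly 1.21 decimal digits per hex digit, plus guard digits.
        ctx.prec = int(count * 1.21) + 20
        a, b = Decimal(1), 1 / Decimal(2).sqrt()
        t, p = Decimal(1) / 4, Decimal(1)
        # Gauss-Legendre roughly doubles the correct digits per iteration.
        for _ in range(ctx.prec.bit_length()):
            a_next = (a + b) / 2
            b = (a * b).sqrt()
            t -= p * (a - a_next) ** 2
            a, p = a_next, p * 2
        pi = (a + b) ** 2 / (4 * t)
        # Move `count` hex digits to the left of the point and truncate.
        scaled = int(pi * Decimal(16) ** count)
    # Drop the integer part (3) and keep the fractional hex digits.
    return format(scaled - 3 * 16 ** count, "x").zfill(count)


def store(data, digits):
    """Map every 1-byte block of `data` to an (index, length) pair."""
    index_table = []
    for byte in data:
        block = format(byte, "02x")      # one byte = two hex digits
        position = digits.find(block)    # first occurrence in the digits
        if position < 0:
            raise ValueError(f"block {block} not in first {len(digits)} digits")
        index_table.append((position, len(block)))
    return index_table


def retrieve(index_table, digits):
    """Rebuild the original bytes from the (index, length) pairs."""
    hex_text = "".join(digits[i:i + n] for i, n in index_table)
    return bytes.fromhex(hex_text)


digits = pi_hex_digits(64)
table = store(bytes.fromhex("243f6a"), digits)
print(table, retrieve(table, digits).hex())
```

Here the "file" is three bytes chosen from the start of the hex expansion, so the lookups are guaranteed to succeed within 64 digits. Arbitrary data would need far more digits, and a block that does not occur in the generated range raises `ValueError`.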

Page 5: Minor Presentation - Distributed file system

So what does it look like?

File        Block     Index           Length
file1.txt   block 1   234334556465    23
            block 2   3245364564      234
            block 3   3422323789472   12

Page 6: Minor Presentation - Distributed file system

Retrieving

● Looking up the metadata (index and length) in the index tables and

● Regenerating the digits of π at those positions.

The Bailey–Borwein–Plouffe (BBP) formula computes the nth hexadecimal digit of π (and hence its binary digits) directly, without computing the preceding digits
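A minimal Python sketch of BBP digit extraction follows, to show why retrieval does not need the digits before the stored index. It relies on the identity π = Σ 16^(-k) · (4/(8k+1) − 2/(8k+4) − 1/(8k+5) − 1/(8k+6)) and on ordinary double-precision floats, so it is only trustworthy for moderately sized positions. The names `pi_hex_digit` and `read_hex` are hypothetical, and positions are counted from 0 after the point (an assumption of this sketch).

```python
def _bbp_series(j, n):
    """Fractional part of the sum over k of 16**(n - k) / (8*k + j)."""
    total = 0.0
    # Terms with k <= n: modular exponentiation keeps only the fractional part.
    for k in range(n + 1):
        denom = 8 * k + j
        total = (total + pow(16, n - k, denom) / denom) % 1.0
    # Terms with k > n shrink by a factor of 16 each step; stop when negligible.
    k = n + 1
    while True:
        term = 16.0 ** (n - k) / (8 * k + j)
        if term < 1e-17:
            break
        total += term
        k += 1
    return total % 1.0


def pi_hex_digit(n):
    """Hex digit of pi at position n after the point (n = 0 gives 2)."""
    x = (4 * _bbp_series(1, n) - 2 * _bbp_series(4, n)
         - _bbp_series(5, n) - _bbp_series(6, n)) % 1.0
    return int(x * 16)


def read_hex(index, length):
    """Regenerate `length` hex digits of pi starting at position `index`."""
    return "".join(format(pi_hex_digit(i), "x") for i in range(index, index + length))


print(read_hex(0, 8))  # 243f6a88
```

Each call to `pi_hex_digit` is independent of the others, which is what makes the retrieval step easy to spread across many cores or machines.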

Page 7: Minor Presentation - Distributed file system

So what does it look like?

file.txt : 10101001011010000010001110111110

File       Block     Index     Length   Value in π (after BBP formula)
file.txt   Block 1   2131231   10       …1010100101…
           Block 2   234234    17       …10100000100011101…
           Block 3   12345     5        …11110…

Page 8: Minor Presentation - Distributed file system

Problems

● High computation required

A 400-line text file takes about 5 minutes to save on a 2 GHz Intel i3 processor

● The index may be very large

The index may turn out to be larger than the file itself
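The index-size problem can be checked against the file.txt example table from the retrieval slide. The sketch below counts the minimum number of bits needed to write each index and length as plain binary numbers and compares that with the 32 bits of the file itself. The index values are the illustrative ones from the slide, not measured positions in π, and the function name `index_cost_bits` is hypothetical.

```python
# (index, length in bits) for the three blocks of file.txt in the example table.
ENTRIES = [(2131231, 10), (234234, 17), (12345, 5)]


def index_cost_bits(entries):
    """Minimum bits to write every index and length as a binary number."""
    return sum(index.bit_length() + length.bit_length() for index, length in entries)


file_bits = sum(length for _, length in ENTRIES)
index_bits = index_cost_bits(ENTRIES)
print(file_bits, index_bits)  # 32 66
```

Even before adding file names or block numbers, the index entries in this example need about twice as many bits as the data they describe.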

Page 9: Minor Presentation - Distributed file system

Moore's Law

The number of transistors in a dense integrated circuit doubles approximately every two years

It may take nearly 20 years before this approach is practical on personal computers.
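As a rough illustration of the 20-year figure, the sketch below projects the 5-minute save time forward. It assumes that processing speed doubles along with transistor count, which is a simplification and not a claim of Moore's Law itself. The function name is hypothetical.

```python
def projected_save_time(seconds_now, years, doubling_period_years=2.0):
    """Save time after `years`, assuming speed doubles every doubling period."""
    speedup = 2 ** (years / doubling_period_years)
    return seconds_now / speedup


# 5 minutes today, 20 years of doubling every 2 years: 2**10 = 1024x faster.
print(round(projected_save_time(5 * 60, 20), 2))  # 0.29
```

Under that assumption, ten doublings would turn the 5-minute save into roughly a third of a second.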

Page 10: Minor Presentation - Distributed file system

Distribution over cloud

[Figure: the digits of π split into ranges distributed over the cloud: 1-10000, 10000-40000, 40000-100000, …]
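One way to route a lookup to the machine holding the right digit range is a sorted list of range boundaries and a binary search. This is a sketch under assumptions: the boundaries are the three ranges shown on the slide, treated as half-open intervals (so digit 10000 belongs to the second range), and the node numbering and the name `node_for_index` are hypothetical.

```python
from bisect import bisect_right

# Exclusive upper bounds of the digit ranges from the slide:
# node 0 holds 1-9999, node 1 holds 10000-39999, node 2 holds 40000-99999.
RANGE_ENDS = [10_000, 40_000, 100_000]


def node_for_index(index):
    """Return the number of the node responsible for digit position `index`."""
    if not 1 <= index < RANGE_ENDS[-1]:
        raise ValueError(f"digit position {index} is outside the stored ranges")
    return bisect_right(RANGE_ENDS, index)


print(node_for_index(12345))  # 1
```

A block whose digits straddle a boundary would need either overlapping ranges or a request to two nodes; the sketch does not handle that case.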

Page 11: Minor Presentation - Distributed file system

Using NVIDIA GPU Cluster

NVIDIA GPUs programmed with CUDA can give a significant performance increase by running the search on thousands of GPU cores.
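The search for a block parallelises naturally because separate stretches of digits can be scanned independently. The sketch below shows that decomposition in Python with a thread pool standing in for GPU thread blocks. It is not CUDA code and gives no real speedup in CPython; it only illustrates the chunking, including the overlap needed so that a match spanning two chunks is not missed. The name `parallel_find` and the chunk size are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# First 64 hex digits of pi after the point, as listed on the "Why π ?" slide.
PI_HEX = "243f6a8885a308d313198a2e03707344a4093822299f31d0082efa98ec4e6c89"


def parallel_find(digits, pattern, chunk_size=16, workers=4):
    """Return the first index of `pattern` in `digits`, or -1 if absent."""
    overlap = len(pattern) - 1
    starts = range(0, len(digits), chunk_size)

    def search(start):
        # Extend the chunk by `overlap` so matches crossing its end are seen.
        return digits.find(pattern, start, start + chunk_size + overlap)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        hits = [hit for hit in pool.map(search, starts) if hit >= 0]
    # Several chunks may report a match; the smallest index is the first one.
    return min(hits) if hits else -1


print(parallel_find(PI_HEX, "707344", chunk_size=8))  # 26
```

On a GPU each chunk would map to a group of threads, with a final reduction step playing the role of `min(hits)`.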

Page 12: Minor Presentation - Distributed file system

File System

Using Filesystem in Userspace (FUSE) libraries. The normal filesystem operations such as

● Change owner

● Change mode

● Locking

● Deleting, etc.

are still the same.

Thus a virtual file system can be implemented without changing the kernel code.

Page 13: Minor Presentation - Distributed file system

Goals/Application

● The project can be used as a cloud storage system with high disk space optimization

● Transmission of the metadata for data stored in π over the network

● Cold storage systems, which hold data that can't be thrown away but may not be accessed for many years

Page 14: Minor Presentation - Distributed file system

Thank You !!

Umar Ahmad (11CSS - 66), Vipul Nayyar (11CSS - 68)