Big Data in an easy way

Ritesh Singh
3 min read · Sep 16, 2020

Question: What is Big Data?

Answer: Big Data is a problem.

  • Let's assume Google receives 1000 GB of data per day and has to keep it on storage devices. In reality, companies such as Google and Facebook receive many times that amount, terabytes and more, from their users.
  • So where do they store it? On storage devices, of course, but how large would a single HDD (storage device) have to be to hold that much data?
  • Google, Amazon, and Facebook are top companies and can afford the high cost of storage devices, but even then a second problem appears: I/O.
  • Suppose it takes 1 min to read or write a 1 GB file. A 1000 GB file then takes about 1000 min (roughly 16.7 hr) to read. Google's real data is far larger than that, so imagine searching for something on Google and being told, "Please wait 2–3 days, we are still reading the data because it is very big."
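The arithmetic in the last bullet can be checked with a few lines of Python. This is only a sketch of the article's illustrative numbers: the function name `sequential_read_minutes` and the rate of 1 minute per GB are assumptions taken from the example, not measurements of any real disk.

```python
def sequential_read_minutes(size_gb: float, minutes_per_gb: float = 1.0) -> float:
    """Time to read a file from ONE storage device, one GB after another."""
    # Illustrative rate from the example above: 1 GB takes 1 minute.
    return size_gb * minutes_per_gb


minutes = sequential_read_minutes(1000)
print(f"{minutes:.0f} min = {minutes / 60:.1f} hr")  # prints "1000 min = 16.7 hr"
```

With a single device the time grows in direct proportion to the data size, which is exactly why the wait becomes unacceptable for very large data.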

Problems: Velocity, Volume, Persistence

Solution: Distributed Storage.

Distributed Storage:

  • Our data is split into blocks.
  • The blocks are distributed across different servers.
  • What does this give us? Take the earlier example: writing 1 GB to a single HDD (storage device) takes 1 min. If we split that 1 GB across three servers that write in parallel, each server handles only a third of it, so the time ideally drops to 1 min / 3 = 20 sec. Adding more servers reduces the read and write time further, and this is how we achieve speed, volume, and persistence.
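The speed-up described above can be sketched as a small calculation. It assumes the ideal case: the data is split evenly, all servers work fully in parallel, and there is no network or coordination overhead. The function name `parallel_seconds` and the rate of 60 seconds per GB are assumptions taken from the example, so real clusters will be slower than this.

```python
def parallel_seconds(size_gb: float, servers: int, seconds_per_gb: float = 60.0) -> float:
    """Ideal time when the data is split evenly across `servers` machines."""
    if servers < 1:
        raise ValueError("at least one server is required")
    # Each server handles size_gb / servers, and all of them work at the same time.
    return size_gb * seconds_per_gb / servers


for n in (1, 3, 10):
    print(f"{n} server(s): {parallel_seconds(1, n):.0f} sec")
# 1 server(s): 60 sec
# 3 server(s): 20 sec
# 10 server(s): 6 sec
```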

Distributed Storage Cluster

  • Many tools work on this basis.
  • In a distributed storage cluster, there are N slave nodes, all connected to a master.
Master-Slave Model
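  • Tools built around distributed clusters include Hadoop, Cassandra, Drill, etc. (Hadoop's HDFS follows the master-slave model, while Cassandra uses a masterless, peer-to-peer design.)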
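To make the master-slave idea concrete, here is a toy, in-memory sketch. It is not how Hadoop or any other real tool is implemented: the class names `Master` and `Slave`, the round-robin placement, and the tiny block size are all assumptions made for illustration. The master only remembers where each block lives, while the slaves hold the actual data.

```python
class Slave:
    """A storage node; a dict stands in for a real server's disk."""

    def __init__(self, name: str):
        self.name = name
        self.blocks = {}

    def store(self, block_id: str, payload: bytes) -> None:
        self.blocks[block_id] = payload

    def fetch(self, block_id: str) -> bytes:
        return self.blocks[block_id]


class Master:
    """Keeps only metadata: which slave holds which block of which file."""

    def __init__(self, slaves, block_size: int):
        self.slaves = slaves
        self.block_size = block_size
        self.locations = {}  # file name -> ordered list of (slave, block_id)

    def write(self, filename: str, data: bytes) -> None:
        placements = []
        for offset in range(0, len(data), self.block_size):
            index = offset // self.block_size
            slave = self.slaves[index % len(self.slaves)]  # round-robin placement
            block_id = f"{filename}#{index}"
            slave.store(block_id, data[offset:offset + self.block_size])
            placements.append((slave, block_id))
        self.locations[filename] = placements

    def read(self, filename: str) -> bytes:
        # Ask each slave for its block and join the blocks in their original order.
        return b"".join(slave.fetch(bid) for slave, bid in self.locations[filename])


slaves = [Slave(f"slave-{n}") for n in range(3)]
master = Master(slaves, block_size=4)
master.write("demo.txt", b"big data in an easy way")
print(master.read("demo.txt"))
print({s.name: len(s.blocks) for s in slaves})
```

In this sketch the 23-byte message is cut into six blocks that are spread evenly over the three slaves, and reading the file back returns the original bytes.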

For more, follow the links below:

Article from: https://techcrunch.com/2012/08/22/how-big-is-facebooks-data-2-5-billion-pieces-of-content-and-500-terabytes-ingested-every-day/#:~:text=Facebook%20revealed%20some%20big%2C%20big,of%20data%20each%20half%20hour.
Article from: https://www.heshmore.com/how-much-data-does-google-handle/#:~:text=Google%20doesn't%20hold%20the,across%20its%20massive%20computing%20clusters.
