What is Small Data and Data?

  • There is a movement towards more personal, subjective analysis of chunks of data, termed "small data".
  • Small data is a data set whose volume and format allow it to be processed and analyzed by a person or a small organization. Thus, instead of collecting data from several sources, in different formats and generated at increasing velocities, and creating large data repositories and processing facilities, small data favors partitioning a problem into small packages, which can be analyzed by different people or small groups in a distributed and integrated way.
  • People are continuously producing small data as…


As data increase in size, velocity and variety, new computer technologies become necessary. These new technologies, which include hardware and software, must be easily expanded as more data are processed. This property is known as scalability.

Even if processing power is expanded by combining several computers in a cluster, creating a distributed system, conventional software for distributed systems usually cannot cope with big data.

One of the limitations is the efficient distribution of data among the different processing and storage units. To deal with these requirements, new software tools and techniques have been developed.

MapReduce is a programming model in which the data set is divided into parts (chunks), and each computer in the cluster stores in its memory the chunk of the data set it needs to accomplish its processing task.
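To make the chunk idea concrete, here is a minimal single-machine sketch of the MapReduce pattern, using word counting as the task. It only simulates the cluster: each chunk stands in for the part of the data set one computer would hold. The function names (`split_into_chunks`, `map_chunk`, `shuffle`, `reduce_groups`, `word_count`) and the sample lines are assumptions made for illustration, not the API of any real framework.

```python
from collections import defaultdict


def split_into_chunks(records, n_chunks):
    """Partition the data set into roughly equal chunks, one per simulated worker."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def map_chunk(chunk):
    """Map step: a worker emits a (word, 1) pair for every word in its own chunk."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group the emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_groups(grouped):
    """Reduce step: combine the values of each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(records, n_chunks=2):
    """Run the map, shuffle and reduce steps over all chunks."""
    chunks = split_into_chunks(records, n_chunks)
    mapped = [pair for chunk in chunks for pair in map_chunk(chunk)]
    return reduce_groups(shuffle(mapped))


# Hypothetical sample data, used only to exercise the sketch.
lines = ["big data small data", "data science", "small data"]
counts = word_count(lines, n_chunks=2)
print(counts)
```

In a real cluster the map step would run in parallel on separate machines, and the shuffle would move data across the network; here everything runs in one process so the flow of data between the steps is easy to follow.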


What is Big Data and Data Science?

  • Big data, a technology for data processing, was initially defined by the three V's.
  • They are volume, variety and velocity.
  • Volume is concerned with how to store big data: data repositories for large amounts of data.
  • Variety is concerned with how to put together data from different sources.
  • Velocity is the ability to deal with data arriving very fast, in streams known as data streams.
  • Analytics is also about discovering knowledge from data streams, going beyond the velocity component of big data.
  • Big data are data sets that are too large…
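As a small illustration of the velocity point, the sketch below processes a data stream one value at a time and keeps a running mean without ever storing the whole stream. The function name `running_mean` and the sample readings are assumptions chosen for the example.

```python
def running_mean(stream):
    """Yield the mean of all values seen so far, updating it incrementally."""
    count = 0
    mean = 0.0
    for value in stream:
        count += 1
        # Incremental update: only the count and the current mean are kept.
        mean += (value - mean) / count
        yield mean


# Hypothetical readings arriving one by one.
readings = iter([10.0, 20.0, 30.0])
means = list(running_mean(readings))
print(means)  # [10.0, 15.0, 20.0]
```

Because only two numbers are held in memory, the same approach works whether the stream has three values or arrives indefinitely.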


  • The science that analyzes crude data to extract useful knowledge (patterns) from them.
  • This process can also include data collection, organization, pre-processing, transformation, modeling and interpretation.
  • The idea of generalizing knowledge from a data sample comes from a branch of statistics known as inductive learning.
  • With the advances of personal computers, computational capacity has been used to develop new methods.
  • The term machine learning (ML) refers to giving computers the ability to learn without being explicitly programmed.
  • A new term appeared with a slightly different meaning: data mining (DM).
  • Companies started to collect more and more data, aiming to either solve or improve business operations, for example…


What can we do with data?

  • Until recently, researchers working with data analytics were struggling to obtain data for their experiments.
  • Recent advances in the technology of data processing, data storage and data transmission, associated with advanced and intelligent computer software, have reduced costs, increased capacity and changed this scenario.
  • Each day, a larger quantity of data is generated and consumed.
  • Whenever you place a comment in your social network, upload a photograph, some music or a video, navigate through the internet, or add a comment to an e-commerce web site, you are contributing to the data increase.
  • These data provide a rich source…

40 manali somani
