Monday, 25 May 2015

Week 6 (25-31.05.2015) NoSQL vs SQL – Which is a Better Option?



With the growing need for performance and availability in services and sites with heavy traffic, the database often becomes the bottleneck. Relational databases quickly reach their limits, and adding servers does not increase performance enough. In response, new technologies such as NoSQL databases have emerged. They radically change the database architecture we are used to and thus make it possible to increase the performance and availability of services. There is of course a "but": no answer is perfect. Google moved in this direction in 2004 with its Bigtable engine. It was followed by the giants of the social web, namely Facebook, Twitter and LinkedIn. These companies have migrated all or part of their information systems from relational databases to NoSQL databases.
Also, this technology is relatively new and does not yet have established migration processes or formal standards, unlike other kinds of databases (object, object-relational and hierarchical) that are well known to the scientific community and to developers.

https://blog.udemy.com/nosql-vs-sql-2/

Could you please answer my questions?

1/ Why do we need NoSQL technology?
2/ Why do main companies use NoSQL and not RDBMS in social media?
3/ Do you have any idea how to develop this technology?

20 comments:

  1. 1/ Why do we need NoSQL technology?

    We need it for the types of data that relational databases struggle with. NoSQL is better at handling unstructured data, as well as data that is stored across many processing nodes and multiple servers. The companies you mentioned migrated to NoSQL for this particular benefit: the data they stockpile for their services is tremendous in volume, and the millions of users accessing it each day make many processing nodes and servers necessary.

    2/ Why do main companies use NoSQL and not RDBMS in social media?

    I answered this question in my reply to your previous one. I would also want to add that some of the social media companies also archive more data than necessary for possible future use.

    3/ Do you have any idea on how to develop this technology?

    I assume you mean the methods of migrating from RDBMS to NoSQL. If I already had the answer to this question, I think I would be rich. :)
    This article provides some information about the matter.

    Replies
    1. 3. I recently had a discussion with a database specialist on this subject (migrating from RDBMS to something else / better) and she said that it's extremely hard to transition. She believes that it's one of the main reasons why relational databases are still so popular - companies are stuck with them because they can't move their data.

    2. This comment has been removed by the author.

    3. Thank you so much for your response, Michail. Please note that for the moment we do not have a universal method for migrating from RDBMS to NoSQL, because each case and each kind of data calls for a different type of NoSQL technology (graph database, key-value database, document-oriented database, column-oriented database).

      And if you find a solution to this problem, you really can be rich ;)

    4. Łukasz, you are right: it's very difficult to migrate from an RDBMS to another database technology, but not impossible. First, though, you must know why you need to move your data to another technology.

  2. Hi Monem,
    Thanks for posting this article, I feel a bit wiser after reading it.

    1. Why do we need NoSQL technology?
    I think the main reason for this is that it’s hard to store big data in relational databases. According to this article, NoSQL enables you to store huge amounts of unstructured data on multiple servers with no performance penalties (or much smaller penalties than in the case of relational databases).

    2. Why do social media companies use NoSQL instead of RDBMS?
    My guess is that they’re storing and gathering insane amounts of data, which probably can’t be stored in traditional databases. Even if it could be stored, probably it would be very hard to process. Also, according to the article, the maintenance costs of NoSQL servers are lower.

    3. Do you have any idea on how to develop this technology?
    Unfortunately I always avoided database-related topics, so I’m far from being a database expert. Luckily most of the data I’m processing can be stored on a single hard drive :) So the answer is no, I only know some basic SQL.

    Replies
    1. I just remembered that I know something else: SQL can be pronounced either as S-Q-L or as 'sequel'. There, that's all I know, please use this knowledge wisely.

    2. In that case "no sequel" implies that it wasn't very good and it didn't get a sequel ;)

    3. Sequel :) I didn't know this before, thank you Łukasz. But really, the appeal of NoSQL databases rests on several strengths that justify their use by the web giants today. The main ones are horizontal partitioning of data and, since these databases are schemaless, flexibility of the data schema.
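
As a rough illustration of horizontal partitioning with schemaless records, here is a minimal Python sketch. The class `ShardedStore`, the keys and the sample documents are hypothetical, and each in-memory dict merely stands in for a separate server; no real NoSQL product is being reproduced here.

```python
import hashlib


class ShardedStore:
    """Toy key-value store that spreads schemaless documents over several shards."""

    def __init__(self, shard_count):
        # Each shard stands in for a separate server; here it is just a dict.
        self.shards = [{} for _ in range(shard_count)]

    def _shard_index(self, key):
        # Use a stable hash: Python's built-in hash() of a str changes between runs.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key, document):
        # The key alone decides which shard holds the document.
        self.shards[self._shard_index(key)][key] = document

    def get(self, key):
        # Only one shard has to be consulted for a lookup by key.
        return self.shards[self._shard_index(key)].get(key)


store = ShardedStore(shard_count=4)
# No schema: the two documents below do not share the same fields.
store.put("user:1", {"name": "Ana", "email": "ana@example.fr"})
store.put("user:2", {"name": "Bob", "followers": 120, "tags": ["sql", "nosql"]})
print(store.get("user:2"))
```

Simple modulo placement like this moves most keys when the number of shards changes, which is one reason real systems often rely on schemes such as consistent hashing instead.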

    4. Yep, they removed all the vowels from 'sequel' because of a trademark violation. Trademark law is funny.

  3. 1/ Why do we need NoSQL technology?

    This is a good question for those like me who don't know this technology. I have some questions about performance, but unfortunately I haven't done any deep research in this field.

    2/ Why do main companies use NoSQL not RDBMS in social media?

    Maybe they are cutting costs, but I doubt it is worth it. :)

    3/ Do you have any idea how to develop this technology?

    I will study first and then think about any further upgrades. :)

  4. 1/ Why do we need NoSQL technology?

    Companies dealing with large amounts of data need it. It was a solution to the growing problem of storing and managing more and more data. Instead of waiting for new standards to develop, they decided they could do it themselves, and that's how this technology came to be.

    2/ Why do main companies use NoSQL not RDBMS in social media?

    Because the ultimate goal of social media is to reach as many users as possible (this is the data handling problem). The other reason is that it is a quickly evolving and very competitive field, so it is important to be able to deploy flexible solutions and take advantage of opportunities as they emerge.

    3/ Do you have any idea how to develop this technology?

    If we are talking about how to develop it further, I would say some level of standardization is the next step. Developing tools to expand the capabilities of this environment is also necessary. The article mentions "facilities for ad-hoc query and analysis".

    Replies
    1. Thank you so much, Wiktor. About querying in NoSQL:
      In the NoSQL world there is no standard language like SQL in the world of relational databases. At the application level, NoSQL databases are often queried through a technique called "MapReduce". MapReduce is a distributed programming technique that is widely used in the NoSQL community and aims to produce distributed queries. It consists of two main stages:
      - Mapping step: each item of a list of key-value pairs is passed to the map function, which returns a new key-value element. Examples of map functions: for each pair (UserId, User), we emit the pair (Role, User), so that after the mapping stage we obtain a list of users grouped by role; or, for each pair (UserId, User), we emit the pair (UserId, User) only if the user's email ends with ".fr".

      - Reduce step: the reduce function is called on the result of the mapping step and can apply an operation to the list. Examples of reduce functions:
        - Average of the values in the list
        - List of the distinct keys in the list
        - Number of entries per key in the list
      The mapping step can be parallelized by processing each key-value pair on a different node of the system. The reduce step cannot start before the mapping step has finished; where it is parallelized, this is typically done per key, since all the values for a given key must be brought together first. NoSQL databases offer various implementations of the MapReduce technique; most often, the map and reduce methods are written in JavaScript or Java.
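
The two stages described above can be sketched in plain Python as follows. This is a single-process illustration, not the API of any particular NoSQL database; the sample users and the helper names (`map_reduce`, `map_by_role`, `map_fr_only`, `count_entries`, `average_age`) are made up for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample data: (user_id, user) pairs, as in the examples above.
USERS = [
    (1, {"name": "Ana", "role": "admin", "email": "ana@example.fr", "age": 30}),
    (2, {"name": "Bob", "role": "editor", "email": "bob@example.com", "age": 40}),
    (3, {"name": "Cleo", "role": "editor", "email": "cleo@example.fr", "age": 50}),
]


def map_by_role(user_id, user):
    """Map step: re-key each user by role."""
    yield user["role"], user


def map_fr_only(user_id, user):
    """Map step used as a filter: emit only users whose email ends with '.fr'."""
    if user["email"].endswith(".fr"):
        yield user_id, user


def count_entries(key, values):
    """Reduce step: number of entries for one key."""
    return len(values)


def average_age(key, values):
    """Reduce step: average of one field over the values of one key."""
    return mean(user["age"] for user in values)


def map_reduce(pairs, map_fn, reduce_fn):
    """Apply map_fn to every pair, group the emitted values by key, reduce each group."""
    groups = defaultdict(list)
    for key, value in pairs:  # on a real cluster, this loop is spread over nodes
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    # Reduction starts only after every pair has been mapped and grouped.
    return {key: reduce_fn(key, values) for key, values in groups.items()}


users_per_role = map_reduce(USERS, map_by_role, count_entries)
average_age_per_role = map_reduce(USERS, map_by_role, average_age)
fr_user_ids = sorted(map_reduce(USERS, map_fr_only, count_entries))
print(users_per_role, average_age_per_role, fr_user_ids)
```

The grouping by key between the two stages is what lets the same `map_reduce` helper serve both the "group by role" and the "filter by email" examples.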

  5. 1/ Why do we need NoSQL technology?

    Like every technology, we need it for particular reasons. NoSQL is a broad term which covers, for example, problems from the world of hypergraphs. In that case it is simply easier to represent the graph structure as a map of graph vertices in a NoSQL database. However, according to Google's newest findings, it may be handier to store graphs in NoSQL, but an RDBMS gives better performance for traversing them. So the latest news is that the best way is to combine the two approaches and store the graph structure in NoSQL and its map data in an RDBMS.

    2/ Why do main companies use NoSQL not RDBMS in social media?

    Because a social structure is one of the best examples of a hypergraph, so this type of data structure fits the NoSQL approach perfectly. A social network is just a map of nodes with connecting edges. It is also a huge data source that can by its nature be easily divided into parts, so that related nodes can be stored on different machines without losing integrity. I guess most current NoSQL frameworks also provide DFS (depth-first search) and BFS (breadth-first search) functions out of the box, which is very useful for this type of data. So it looks like NoSQL has been cut perfectly for the needs of social networks.
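
To make the graph view concrete, here is a small Python sketch that stores a social network as a map from each user to their connections and traverses it breadth-first and depth-first. The names and the `FRIENDS` data are invented, and the two functions are textbook traversals, not the built-in functions of any NoSQL framework.

```python
from collections import deque

# Hypothetical friendship graph: each user maps to the list of their connections.
FRIENDS = {
    "ana": ["bob", "cleo"],
    "bob": ["ana", "dan"],
    "cleo": ["ana", "dan"],
    "dan": ["bob", "cleo", "eve"],
    "eve": ["dan"],
}


def bfs_distances(graph, start):
    """Breadth-first search: number of hops from start to every reachable user."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


def dfs_order(graph, start):
    """Depth-first search: order in which users are first visited."""
    visited = []
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        visited.append(node)
        # Push in reverse so that neighbours are explored in their listed order.
        stack.extend(reversed(graph.get(node, [])))
    return visited


print(bfs_distances(FRIENDS, "ana"))
print(dfs_order(FRIENDS, "ana"))
```

Breadth-first search answers "how many hops apart are two users", while depth-first search follows one chain of connections as far as it goes before backtracking.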

    3/ Do you have any idea how to develop this technology?

    As we heard recently, the latest development is the combination of NoSQL and RDBMS. It makes sense to me, as NoSQL is still an untrusted technology, in contrast to the mature RDBMS, which is backed by at least two decades of development. I guess that during that time we have faced almost every possible RDBMS problem, including performance considerations, so we can now bring this knowledge to the NoSQL approach. However, I find it quite funny: I think the weaknesses of RDBMS were partly the basis for the invention of NoSQL, and now RDBMS helps to fix NoSQL's issues.

  7. 1/ Why do we need NoSQL technology?

    Nowadays we are witnessing a social media breakthrough. People exchange more and more data using Facebook, Twitter, Instagram and other social media. Access to the net, mainly through mobile devices, makes it possible to produce fast and dirty (full of mistakes) information. As a result, attempts to fit the data into fixed structures are getting out of hand. In my opinion NoSQL tech is more flexible with respect to the needs of today's information world. More features are listed in the answer to the second question.

  8. 2/ Why do main companies use NoSQL not RDBMS in social media?

    Having read the paper, I'd like to sum up the features of NoSQL.
    First of all, NoSQL offers high performance because of agile processing.
    Social media produces more and more data, so big data processing is required, and NoSQL tech makes it possible.
    What is more, we can store, access and analyse data of different kinds (unstructured data), unlike in SQL.
    Moreover, one of the advantages of the tech is its distributed architecture over multiple servers, which makes parallel processing possible.
    Finally, according to the paper, maintaining NoSQL servers is cheaper than maintaining RDBMS servers.

  9. 3/ Do you have any idea how to develop this technology?

    If we want to develop this technology, first of all it has to gain more popularity. RDBMS is tied to the history of SQL, so both have been developing since the 1970s. In my opinion a mixture of the two technologies, I mean SQL + NoSQL, is the first step towards popularising the second one. Too many companies run on RDBMS, and from a financial point of view they are not able to change the technology.

  10. 1/ Why do we need NoSQL technology?
    > key-value stores are much easier to scale
    > a short learning curve makes it more desirable
    > although it's arguable, scenario-dependent performance advantages
    > consumes fewer system resources (i.e. no need for script compilation)

    2/ Why do main companies use NoSQL not RDBMS in social media?
    > Scalability is very important for the vast amounts of social media data
    > Better suited to divide-and-conquer (MapReduce) strategies

    3/ Do you have any idea how to develop this technology?
    > Yes and no. There is nothing magical about developing a NoSQL DB. The devil is in the details. Performance + Stability + Persistence (maybe) ..

  11. These are some hard questions for me to answer, as my domain is completely different and I only understand databases on a very conceptual level. I will do my best.


    1/ Why do we need NoSQL technology?

    There is a vast, growing amount of data collected within our internet of things, so we need new approaches to data management, such as NoSQL and other parallel systems. From my understanding of these systems (as I said, on a very conceptual level), the difference between a traditional database and a parallel one is approximately the same as the difference between information management and knowledge management within the socio-economic system (this is my domain).

    I understand that in parallel systems the data is not stored in one vast repository, but is distributed across a network of databases, kept flexibly in different nodes and mapped in real time. This process is based on decentralizing data. The similarity to the socio-economic system is very visible. In the modern days of globalization it is impossible for one institution, situated at the top of the hierarchy, to hold all the knowledge needed to take decisions, and it is impossible for it to make decisions top-down. The knowledge has to be decentralized and kept in different nodes, meaning it is delegated to managers lower in the hierarchy, and so is the power. The relations within such a system become flatter and more adaptive.

    2/ Why do main companies use NoSQL not RDBMS in social media?

    Probably because social media is a network based on adaptive and flat relations, and the knowledge created by one specific social community (node) is different to knowledge created by a different one. If one vast repository was created to hold all the knowledge (data) it would be impossible to create universal rules to manage it.

    3/ Do you have any idea how to develop this technology?

    Certainly it has to be developed based on human input and specific human needs. If the technology is developed based only on its own boundaries, without taking into consideration the subjective, human points of view, we end up in a technocratic world, with systems that do not accept any exception, that have no empathy for us and that turn societies into efficient machines.
