OK. You may already know from the intro article that I joined PingCAP, the company behind TiDB. The first question that comes to mind may be: what is TiDB?
Well, I did my homework during the interview that
Alright, I learned SQL at school in 2010 (I think that reveals my age), and the textbook for the database class was on Oracle Database 10g. Because my job responsibilities have focused on the middleware layer, I haven’t worked closely with the database layer, other than running some queries. I also missed the NoSQL trend, since I mainly worked on business applications where transactions are the focus. All I knew about NoSQL was that it can store almost anything, like pictures and documents that do not typically have a relational structure, and that it is more commonly used in “big data” scenarios. But what is a NewSQL database? On PingCAP’s website, it is called “SQL at Scale”. Is that what NewSQL is?
So I did some reading. Here is a brief summary:
With SQL, the data has a fixed structure, so you can query and pivot it fast, but that same structure makes scaling take additional effort. To scale, a traditional SQL database usually goes the "vertical" way, by adding more hardware resources to one single server. If we think of a server as a building, we add more floors to the building to increase its capacity, and this is called “scale up”. With NoSQL, you can handle a huge volume of data, but the data is typically unstructured or only loosely structured. It is usually built on a distributed system, so it can store a lot of data and scale fast. This is the "horizontal" way of scaling, by inviting additional "server friends" to do the work together. To increase the capacity, we build additional buildings, and this is called “scale out”.
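To make the “more buildings” picture a bit more concrete, here is a tiny Python sketch of scale-out. It is only a toy under my own assumptions: the function names `shard_for` and `distribute` are made up for illustration, and it uses naive hash-modulo routing, which is not how TiDB or any specific database actually places data. The point is just that the same set of rows gets split across however many servers you have.

```python
import hashlib


def shard_for(key: str, num_servers: int) -> int:
    """Pick a server index for a key using a stable hash (toy modulo routing)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


def distribute(keys, num_servers):
    """Group keys by the (hypothetical) server that would store them."""
    servers = [[] for _ in range(num_servers)]
    for key in keys:
        servers[shard_for(key, num_servers)].append(key)
    return servers


if __name__ == "__main__":
    orders = [f"order-{i}" for i in range(1000)]

    # One "building": every row lives on the single server.
    print([len(s) for s in distribute(orders, 1)])

    # Four "buildings": the same rows are spread across four servers.
    print([len(s) for s in distribute(orders, 4)])
```

Scaling up would mean making that one server bigger. Scaling out means raising `num_servers`, so each server holds only a share of the rows. Real distributed databases also have to move existing data around when servers are added, which this sketch ignores.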
Because the underlying designs of SQL and NoSQL are so different, you will usually choose either a SQL database or a NoSQL database, depending on the business need. That’s what we call a “trade-off”. We make trade-offs every day. Even when choosing databases, we need to choose between “OLTP” (online transaction processing, efficient for transactions) and “OLAP” (online analytical processing, efficient for analytics). I got it.
Then the thing called the NewSQL database emerged. It tells you that you don’t have to make this trade-off any more. Yes, you can have both the fish and the bear paw now! (If you are confused, here is the reference to this Chinese idiom). Basically, it is a distributed database that scales horizontally, which is great for rapid growth. At the same time, you can still use SQL to work with it. So if your application already runs on a SQL database, the idea is that you can switch to a NewSQL database with little modification, and it will handle the scalability for you. You can just focus on your business and growth, without worrying about database capacity.
It opens a door for application development that has to meet the challenge of rapidly growing data. I am quoting our co-founder from my previous company here: “The whole pandemic accelerated digital transformation for at least 10 years. Every company will become a tech company”. I have also seen a lot of cases like this in the past year. More restaurants have become available on food delivery platforms like DoorDash. A bike store in New York that usually serves no more than 20 customers a day can serve hundreds of customers once it opens its own website. That means more website views that could be converted to opportunities, and more online orders. All of that is data, eventually. And NewSQL seems to be the cure for the growing pains of every business under the digital transformation trend.
So this really looks like the next big thing to me. And I am excited about finally not having to make a choice. I am also very curious to learn more about how exactly NewSQL achieves it. This will be something fun to work on, and I will share what I learn with you all along the way.
We will find another time to talk about the “HTAP” part of TiDB. But first I want to get hands-on and test the SQL capabilities. See you in my next post!