XEI - Inception

Prior to the summer of 2023, the XEI core team was focused entirely on crafting advanced quantitative trading systems for US stock and cryptocurrency exchanges. The main hurdle we faced was building the infrastructure needed to support a high-performance backend trading system capable of handling immense computational demands.

Our trading approach, bordering on high-frequency trading (HFT), required real-time analysis of tick data from more than 1,000 stocks and 150 cryptocurrencies. HFT uses powerful computer algorithms to scan markets and execute large numbers of orders at extremely high speed based on prevailing conditions. On top of that, our system had to perform real-time backtesting and fine-tuning of algorithmic parameters for each traded asset, while supporting trading activity for more than 30,000 clients on platforms such as ETrade.com, Alpaca Markets, and Binance.com, and executing client orders within 200 milliseconds of a market event.

Our game-changing moment came with the discovery of Ray.io, the open-source distributed computing framework that OpenAI reportedly uses to spread the training of GPT-3/4 across hundreds of thousands of CPUs and GPUs. Ray enabled us to cut the development time of our backend infrastructure from more than half a year to under two months.
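To give a sense of the programming model, Ray turns ordinary Python functions into distributed tasks with a single decorator. A minimal sketch (the function and symbols are illustrative placeholders, not our production trading code):

```python
import ray

ray.init()  # connect to or start a local Ray cluster

# Illustrative task: the decorator turns a plain function into one that
# can be scheduled on any worker in the cluster.
@ray.remote
def score_symbol(symbol: str) -> tuple[str, float]:
    # Placeholder for per-symbol analysis (e.g. signal computation).
    return symbol, hash(symbol) % 100 / 100.0

# Fan out one task per symbol and gather the results in parallel.
futures = [score_symbol.remote(s) for s in ["AAPL", "TSLA", "BTC-USD"]]
print(ray.get(futures))
```

Each `.remote()` call returns a future immediately, so a thousand symbols can be scored concurrently across whatever workers the cluster has available.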

After integrating Ray into our backend, we prepared to launch the application across a pool of GPU and CPU workers to cover our computational needs. That is when we hit a snag: the exorbitant cost of running such a system, driven primarily by the on-demand pricing of GPU cloud providers.
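Ray schedules work onto such a mixed pool by letting each task declare the CPUs or GPUs it needs, with the same caveat as above that the function names and bodies here are placeholders rather than our actual code:

```python
import ray

ray.init()

@ray.remote(num_cpus=2)
def backtest_slice(params: dict) -> float:
    # CPU-bound work: replay ticks against one parameter set (placeholder).
    return sum(params.values())

@ray.remote(num_gpus=1)
def train_model(config: dict) -> str:
    # GPU-bound work: Ray reserves one GPU for this task and exposes it via
    # CUDA_VISIBLE_DEVICES (placeholder body; needs a node with a GPU).
    return f"trained with {config}"

cpu_jobs = [backtest_slice.remote({"alpha": i * 0.1}) for i in range(4)]
gpu_job = train_model.remote({"lr": 1e-3})
print(ray.get(cpu_jobs), ray.get(gpu_job))
```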

The Cost Barrier

Take, for example, the NVIDIA A100, which costs upwards of $75 per day per card. Our operations required more than 50 such cards for an average of 25 days each month, leading to a staggering monthly expense of approximately $100,000. This financial burden was a significant challenge not just for us but for other self-financed startups in the AI/ML sector as well.
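As a back-of-the-envelope check of those figures (a sketch using the round numbers quoted above):

```python
# Rough monthly GPU cost model using the figures quoted above.
a100_cost_per_day_usd = 75   # quoted on-demand price per A100 per day
num_cards = 50               # "over 50" cards in use
days_per_month = 25          # average days of use per month

monthly_cost = a100_cost_per_day_usd * num_cards * days_per_month
print(f"~${monthly_cost:,} per month")  # ~$93,750 at exactly 50 cards,
                                        # approaching $100K as the count rises
```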

Despite these high costs, demand for AI compute has been surging, with the compute used in the largest training runs doubling roughly every three to four months, which compounds to roughly an order of magnitude per year. OpenAI reportedly relied on a cluster of roughly 300,000 CPU cores and 10,000 GPUs just to train GPT-3, and that is only the tip of the iceberg in the ever-growing need for AI computational power.
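As a rough check of what that doubling rate implies (a sketch; the 3.4-month doubling period is the figure commonly cited from OpenAI's compute analysis):

```python
# Compound growth implied by a ~3.4-month doubling period.
doubling_period_months = 3.4
growth_per_year = 2 ** (12 / doubling_period_months)
print(f"~{growth_per_year:.1f}x per year")  # ≈ 11.6x, i.e. roughly tenfold
```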
