Hey all! Firstly a huge thanks in advance to anyone who spends time responding to this.

So I’m working on my MVP, which I’m about to launch (in its simplest form, it’s an AI-based news aggregator).

To date, my server setup has been:

  1. Data storage, scraping, and app API calls all go through my DigitalOcean server. This is a 2 GB memory, 1 AMD vCPU, 50 GB disk server running LAMP on Ubuntu 20.04.

  2. All my AI/LLM work, where I preprocess and clean text and locally run LLMs from Hugging Face, is done on a Scaleway PLAY2-PICO instance.

A few issues I’m facing:

  1. The API calls to the DigitalOcean server are incredibly slow. It takes 5 seconds to load posts, and I’m the only one using the app.

  2. The LLM processes on the Scaleway server just get killed, I assume due to memory issues.
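A rough way to sanity-check the memory suspicion is to compare the size of the model weights against the RAM on the instance. The sketch below only counts weights (parameters times bytes per parameter), so real usage will be higher once activations and the runtime are included. The 7-billion-parameter figure and the 2 GB RAM value are illustrative assumptions, not numbers from this setup, and `weights_memory_gb` and `fits_in_ram` are hypothetical helper names.

```python
def weights_memory_gb(num_parameters, bytes_per_parameter):
    """Approximate memory needed just to hold the model weights, in GB."""
    return num_parameters * bytes_per_parameter / 1e9


def fits_in_ram(num_parameters, bytes_per_parameter, ram_gb):
    """True if the weights alone fit in RAM (a necessary, not sufficient, check)."""
    return weights_memory_gb(num_parameters, bytes_per_parameter) < ram_gb


if __name__ == "__main__":
    assumed_parameters = 7_000_000_000  # assumption: a 7B-parameter model
    assumed_ram_gb = 2                  # assumption: substitute your instance's RAM
    for label, size in (("float32", 4), ("float16", 2), ("int8", 1)):
        needed = weights_memory_gb(assumed_parameters, size)
        ok = fits_in_ram(assumed_parameters, size, assumed_ram_gb)
        print(f"{label}: {needed:.0f} GB for weights, fits: {ok}")
```

If the weights alone exceed the available RAM, the Linux out-of-memory killer terminating the process is a likely explanation, and the kernel log should confirm it.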

So now to the question: what server architecture / providers do you guys use? It needs to handle large MySQL tables quickly and also run large LLMs (the two don’t need to be on the same setup).

Much appreciated!

  • tony-berg@alien.top · 11 months ago

    Speaking as a techie having dealt with clients with that problem: You need a techie to help you.

    Your whole concept could be flawed to the point where you will find it prohibitively expensive to create the user experience that you’re going for, or there might be a switch to flip.

    And the reason I’m saying that you need help from someone to figure this out is that if you had the skills to find the underlying cause, you wouldn’t ask what you just asked. Your question would be a lot more specific, and it would be in a more relevant tech subreddit, or you’d right now be reading relevant posts on Stack Overflow.

    You really need to dig deeper into what’s causing that delay, and you need to try things like emulating 100+ concurrent users to see what happens with that delay. You need that not only to figure out how to minimize the delay, but also to calculate how much the resources you need (CPU, GPU, RAM, network, and so on) will cost you on different platforms, so that you can get your numbers right and figure out whether your business plan is realistic. Can you afford to operate your business? Is your target market willing to pay what you must charge? How much capital do you need to invest in your own hardware (or perhaps operate at a loss on a cloud provider to get early scalability)? Do you need to rework your software to take advantage of doing some calculations during off-peak hours?
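As a minimal sketch of what emulating concurrent users could look like, the Python below fires requests from a pool of threads and reports latency percentiles. `fetch_posts` is a hypothetical stand-in that just sleeps for 50 ms; it would need to be replaced with a real HTTP call to your own endpoint. The user counts and requests per user are arbitrary assumptions, and a dedicated load-testing tool would give more trustworthy numbers.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_posts():
    """Hypothetical stand-in for one API request (simulated with a 50 ms sleep)."""
    time.sleep(0.05)


def timed_call(request_fn):
    """Run one request and return how long it took, in seconds."""
    start = time.perf_counter()
    request_fn()
    return time.perf_counter() - start


def load_test(request_fn, concurrent_users=100, requests_per_user=3):
    """Issue requests from many simulated users at once and summarize latency."""
    total = concurrent_users * requests_per_user
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        latencies = sorted(pool.map(timed_call, [request_fn] * total))
    return {
        "requests": total,
        "median_s": statistics.median(latencies),
        "p95_s": latencies[int(0.95 * (total - 1))],
        "max_s": latencies[-1],
    }


if __name__ == "__main__":
    # Compare a single user against heavier concurrency to see how latency grows.
    for users in (1, 10, 100):
        print(users, load_test(fetch_posts, concurrent_users=users))
```

If the median stays flat as users increase, the delay is probably in a single slow query or request path; if it climbs steeply, the server is running out of some resource under load.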