Server Set up

thewanitz@alien.top · 3 years ago

EveryThingPlay@alien.top · 3 years ago

Agreed, profiling is really needed to see where the bottleneck is. My guess is the LLM itself (some of them are really heavy), but it may actually just be wrong server settings. And yeah, running the whole neural network inside the Flask app is a bad idea, not only for performance but for stability too: if something crashes in the model, there's a high risk it takes the Flask app down with it and the whole application gets stuck.
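
To make the "keep the model out of the Flask process" point concrete, here is a rough sketch using only the standard library. Everything in it is made up for illustration: `ModelWorker`, `WORKER_SOURCE`, and the JSON-lines protocol are hypothetical names, and the uppercase "model" is just a stand-in for a real LLM call. The idea is that the web handler talks to a separate process, times each round trip (a crude first step before proper profiling), and survives a crash on the model side.

```python
import json
import subprocess
import sys
import time

# Hypothetical worker: stands in for a process that would load the real model
# once and then answer one JSON request per line on stdin.
WORKER_SOURCE = r'''
import json, os, sys
for line in sys.stdin:
    prompt = json.loads(line)["prompt"]
    if prompt == "crash":
        os._exit(1)  # simulate the model taking its whole process down
    print(json.dumps({"text": prompt.upper()}), flush=True)
'''


class ModelWorker:
    """Owns the model process and restarts it if it has died."""

    def __init__(self):
        self._proc = None
        self._start()

    def _start(self):
        self._proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def generate(self, prompt):
        # Restart the worker if an earlier request killed it.
        if self._proc.poll() is not None:
            self._start()
        started = time.perf_counter()
        try:
            self._proc.stdin.write(json.dumps({"prompt": prompt}) + "\n")
            self._proc.stdin.flush()
            line = self._proc.stdout.readline()
        except (BrokenPipeError, OSError):
            line = ""
        seconds = time.perf_counter() - started
        if not line:
            # Empty read means the worker exited; the caller is still alive.
            self._proc.wait()
            return {"ok": False, "error": "model process exited", "seconds": seconds}
        return {"ok": True, "text": json.loads(line)["text"], "seconds": seconds}

    def close(self):
        if self._proc.poll() is None:
            self._proc.stdin.close()  # worker loop ends on EOF
            self._proc.wait(timeout=5)
```

A Flask view would then call something like `worker.generate(prompt)` and return an error response when `ok` is false, instead of dying together with the model. In a real setup you'd more likely put the model behind its own server or a task queue, but the isolation idea is the same, and logging `seconds` per request gives a first hint of whether the model or the server config is the slow part.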