[MIRROR] Resolves is_banned_from headaches and lag (Speeds up roundstart significantly) [MDB IGNORE] (#16001)

* Resolves is_banned_from headaches and lag (Speeds up roundstart significantly) (#69376)

About The Pull Request

Just to be clear, when I refer to time here, I am not talking about cpu time. I'm talking about real time.
This doesn't significantly reduce the amount of work we do, it just removes a lot of the waiting around we need to do for db calls to finish.

Adds queuing support to sql bans, so if an ongoing ban retrieval query is active any successive ban retrieval attempts will wait for the active query to finish

This uses the number/blocking_query_timeout config option, I hope it's still valid

This system will allow us to precache ban info, in parallel (or in batches)
With this, we can avoid needing to setup all uses of is_banned_from to support parallelization or eat the cost of in-series database requests

Clients who join after initialize will now build a ban cache automatically

Those who join before init is done will be gathered by a batch query sent by a new subsystem, SSban_cache.

This means that any post initalize uses of is_banned_from are worst case by NATURE parallel (since the request is already sent, and we're just waiting for the response)

This saves a lot of headache for implementers (users) of the proc, and saves ~0.9 second from roundstart setup for each client (on /tg/station)

There's a lot of in series is_banned_from calls in there, and this nukes them. This should bring down roundstart join times significantly.

It's hard to say exactly how much, since some cases generate the ban cache at other times.
At base tho, we save about 0.9 seconds of real time per client off doing this stuff in parallel.
Why It's Good For The Game

    When I use percentages I'm speaking about cost per player

I don't like how slow roundstart feels, this kills about 66% of that. the rest is a lot of misc things. About 11% (it's actually 16%) is general mob placing which is hard to optimize. 22% is manifest generation, most of which is GetFlatIcons which REALLY do not need to be holding up the main thread of execution.

An additional 1 second is constant cost from a db query we make to tell the server we exist, which can be made async to avoid holding the proc chain.

That's it. I'm bullying someone into working on the manifest issue, so that should just leave 16% of mob placing, which is really not that bad compared to what we have now.
Changelog

cl
code: The time between the round starting and the game like, actually starting has been reduced by 66%
refactor: I've slightly changed how ban caches are generated, admins please let me know if anything goes fuckey
server: I'm using the blocking_query_timeout config. Make sure it's up to date and all.
/cl

* Resolves is_banned_from headaches and lag (Speeds up roundstart significantly)

Co-authored-by: LemonInTheDark <58055496+LemonInTheDark@users.noreply.github.com>
This commit is contained in:
SkyratBot
2022-09-02 04:11:40 +02:00
committed by GitHub
parent f6317d2c2c
commit e7230e8b4a
8 changed files with 122 additions and 5 deletions

View File

@@ -163,7 +163,8 @@
#define INIT_ORDER_PATH -50
#define INIT_ORDER_DECAY -61 //SKYRAT EDIT ADDITION
#define INIT_ORDER_EXPLOSIONS -69
#define INIT_ORDER_STATPANELS -98
#define INIT_ORDER_STATPANELS -97
#define INIT_ORDER_BAN_CACHE -98
#define INIT_ORDER_INIT_PROFILER -99 //Near the end, logs the costs of initialize
#define INIT_ORDER_CHAT -100 //Should be last to ensure chat remains smooth during init.

View File

@@ -0,0 +1,73 @@
/// Subsystem that batches a ban cache list for clients on initialize
/// This way we don't need to do ban checks in series later in the code
SUBSYSTEM_DEF(ban_cache)
/datum/controller/subsystem/ban_cache
name = "Ban Cache"
init_order = INIT_ORDER_BAN_CACHE
flags = SS_NO_FIRE
var/query_started = FALSE
/datum/controller/subsystem/ban_cache/Initialize(start_timeofday)
generate_queries()
return ..()
/// Generates ban caches for any logged in clients. This ensures the amount of in-series ban checking we have to do that actually involves sleeps is VERY low
/datum/controller/subsystem/ban_cache/proc/generate_queries()
query_started = TRUE
if(!SSdbcore.Connect())
return
var/current_time = REALTIMEOFDAY
var/list/look_for = list()
for(var/ckey in GLOB.directory)
var/client/lad = GLOB.directory[ckey]
// If they've already got a ban cached, or a request goin, don't do it
if(lad.ban_cache || lad.ban_cache_start)
continue
look_for += ckey
lad.ban_cache_start = current_time
// We're gonna try and make a query for clients
var/datum/db_query/query_batch_ban_cache = SSdbcore.NewQuery(
"SELECT ckey, role, applies_to_admins FROM [format_table_name("ban")] WHERE ckey IN (:ckeys) AND unbanned_datetime IS NULL AND (expiration_time IS NULL OR expiration_time > NOW())",
list("ckeys" = look_for.Join(","))
)
var/succeeded = query_batch_ban_cache.Execute()
for(var/ckey in look_for)
var/client/lad = GLOB.directory[ckey]
if(!lad || lad.ban_cache_start != current_time)
continue
lad.ban_cache_start = 0
if(!succeeded)
qdel(query_batch_ban_cache)
return
var/list/ckey_to_bans = list()
// Runs after the check for safety, don't want to override anything
for(var/ckey in look_for)
ckey_to_bans[ckey] = list()
while(query_batch_ban_cache.NextRow())
var/ckey = query_batch_ban_cache.item[1]
var/role = query_batch_ban_cache.item[2]
var/hits_admins = query_batch_ban_cache.item[3]
var/list/bans = ckey_to_bans[ckey]
if(!bans)
continue
// Yes I know this is slightly unoptimal, no I do not care
var/is_admin = GLOB.admin_datums[ckey] || GLOB.deadmins[ckey]
if(is_admin && !text2num(hits_admins))
continue
bans[role] = TRUE
for(var/ckey in ckey_to_bans)
var/client/lad = GLOB.directory[ckey]
if(!lad)
continue
lad.ban_cache = ckey_to_bans[ckey]
qdel(query_batch_ban_cache)

View File

@@ -276,7 +276,7 @@ SUBSYSTEM_DEF(ticker)
log_world("Game start took [(world.timeofday - init_start)/10]s")
round_start_time = world.time
SSdbcore.SetRoundStart()
INVOKE_ASYNC(SSdbcore, /datum/controller/subsystem/dbcore/proc/SetRoundStart)
to_chat(world, span_notice("<B>Welcome to [station_name()], enjoy your stay!</B>"))
alert_sound_to_playing(sound(SSstation.announcer.get_rand_welcome_sound())) //SKYRAT EDIT CHANGE

View File

@@ -10,7 +10,7 @@
return
var/client/player_client = GLOB.directory[player_ckey]
if(player_client)
var/list/ban_cache = player_client.ban_cache || build_ban_cache(player_client)
var/list/ban_cache = retrieve_ban_cache(player_client)
if(!islist(ban_cache))
return // Disconnected while building the list.
if(islist(roles))
@@ -99,12 +99,43 @@
. += list(list("id" = query_check_ban.item[1], "bantime" = query_check_ban.item[2], "round_id" = query_check_ban.item[3], "expiration_time" = query_check_ban.item[4], "duration" = query_check_ban.item[5], "applies_to_admins" = query_check_ban.item[6], "reason" = query_check_ban.item[7], "key" = query_check_ban.item[8], "ip" = query_check_ban.item[9], "computerid" = query_check_ban.item[10], "admin_key" = query_check_ban.item[11]))
qdel(query_check_ban)
/// Gets the ban cache of the passed in client
/// If the cache has not been generated, we start off a query
/// If we still have a query going for this request, we just sleep until it's recieved back
/proc/retrieve_ban_cache(client/player_client)
if(QDELETED(player_client))
return
if(player_client.ban_cache)
return player_client.ban_cache
var/config_delay = CONFIG_GET(number/blocking_query_timeout) SECONDS
// If we haven't got a query going right now, or we've timed out on the old query
if(player_client.ban_cache_start + config_delay < REALTIMEOFDAY)
return build_ban_cache(player_client)
// Ok so we've got a request going, lets start a wait cycle
// If we wait longer then config/db_ban_timeout we'll send another request
// We use timeofday here because we're talking human time
// We do NOT cache the start time because it can update, and we want it to be able to
while(player_client && player_client?.ban_cache_start + config_delay >= REALTIMEOFDAY && !player_client?.ban_cache)
// Wait a decisecond or two would ya?
// If this causes lag on client join, increase this delay. it doesn't need to be too fast since this should
// Realllly only happen near Login, and we're unlikely to make any new requests in that time
stoplag(2)
// If we have a ban cache, use it. if we've timed out, go ahead and start another query would you?
// This will update any other sleep loops, so we should only run one at a time
return player_client?.ban_cache || build_ban_cache(player_client)
/proc/build_ban_cache(client/player_client)
if(!SSdbcore.Connect())
return
if(QDELETED(player_client))
return
var/current_time = REALTIMEOFDAY
player_client.ban_cache_start = current_time
var/ckey = player_client.ckey
var/list/ban_cache = list()
var/is_admin = FALSE
@@ -120,9 +151,14 @@
"SELECT role, applies_to_admins FROM [format_table_name("ban")] WHERE ckey = :ckey AND unbanned_datetime IS NULL AND (expiration_time IS NULL OR expiration_time > NOW()) AND [server_check]", //skyrat edit
list("ckey" = ckey)
)
if(!query_build_ban_cache.warn_execute())
var/query_successful = query_build_ban_cache.warn_execute()
// After we sleep, we check if this was the most recent cache build, and if so we clear our start time
if(player_client?.ban_cache_start == current_time)
player_client.ban_cache_start = 0
if(!query_successful)
qdel(query_build_ban_cache)
return
while(query_build_ban_cache.NextRow())
if(is_admin && !text2num(query_build_ban_cache.item[2]))
continue

View File

@@ -50,6 +50,8 @@
///Used to cache this client's bans to save on DB queries
var/ban_cache = null
///If we are currently building this client's ban cache, this var stores the timeofday we started at
var/ban_cache_start = 0
///Contains the last message sent by this client - used to protect against copy-paste spamming.
var/last_message = ""
///contins a number of how many times a message identical to last_message was sent.

View File

@@ -479,6 +479,10 @@ GLOBAL_LIST_INIT(blacklisted_builds, list(
get_message_output("watchlist entry", ckey)
check_ip_intel()
validate_key_in_db()
// If we aren't already generating a ban cache, fire off a build request
// This way hopefully any users of request_ban_cache will never need to yield
if(!ban_cache_start && SSban_cache?.query_started)
INVOKE_ASYNC(GLOBAL_PROC, /proc/build_ban_cache, src)
send_resources()

View File

@@ -13,7 +13,7 @@
var/base_build_path = /obj/machinery/smartfridge
/// Maximum number of items that can be loaded into the machine
var/max_n_of_items = 1500
/// If the AI is allowed to retrive items within the machine
/// If the AI is allowed to retrieve items within the machine
var/allow_ai_retrieve = FALSE
/// List of items that the machine starts with upon spawn
var/list/initial_contents

View File

@@ -542,6 +542,7 @@
#include "code\controllers\subsystem\assets.dm"
#include "code\controllers\subsystem\atoms.dm"
#include "code\controllers\subsystem\augury.dm"
#include "code\controllers\subsystem\ban_cache.dm"
#include "code\controllers\subsystem\blackbox.dm"
#include "code\controllers\subsystem\blackmarket.dm"
#include "code\controllers\subsystem\chat.dm"