Improve logging of DB message insertion #1893

H3rnand3zzz · 2023-10-08T22:15:10Z

Addressing unreliable message history reports where some messages were written in the file log but not in the DB. This issue was not consistently reproducible, but investigations revealed potential causes.

SQLite file access issues were ruled out, as they would trigger errors in the error log.
The problem could be due to the WHERE clause in the insert query, preventing message insertion.

To resolve this, I propose:

Removing the WHERE clause from the insert query, allowing messages with duplicate IDs or stanza IDs to be inserted.
Implementing a warning log for duplicate stanza-ids or archive-ids for further investigations.

I also am planning to offer an optional setting in the future to handle duplicate IDs or stanza IDs differently, to resolve potential duplicate messages issues which might happen theoretically because of MAM. The exact shape of the option depends on many things, though. In any case, I see duplicate messages as a lesser evil compared to unreliable history.

This is a prototype solution and will require feedback and refinement.

This change aims to improve message history reliability, especially for users who rely on logs for critical information.

The solution is based on the way that other clients handle such cases (by ignoring duplication of stanza-id).
https://dev.gajim.org/gajim/gajim/-/blob/master/gajim/common/modules/message.py?ref_type=heads#L224

I ask @jubalh to provide more context from MUC (since I lost MUC history).

jubalh · 2023-10-09T06:53:18Z

We had some discussions about this via PM and in the Profanity MUC. I'll try to add some more context.

> Addressing unreliable message history reports where some messages were written in the file log but not in the DB.

The "file log", what we usually call text chat log files, is just a dump of everything we receive/send. There is no other logic there. It was our legacy system before we had the SQL database. We considered removing it but then decided that some people will like to have those chat logs for easy grepping of conversations.

> The problem could be due to the WHERE clause in the insert query, preventing message insertion.

So the assumption I made via PM was that either something is not right with our NULL checking, in which case we silently fail to insert the message into the DB, or our WHERE logic is not correct. We had an issue like this in the past: c5b370b.

> I also am planning to offer an optional setting in the future to handle duplicate IDs or stanza IDs differently, to resolve potential duplicate messages issues which might happen theoretically because of MAM.

I'm not exactly sure what you mean here. Right now MAM doesn't insert messages twice.

> The solution is based on the way that other clients handle such cases (by ignoring duplication of stanza-id).

I'm surprised they just insert. It was a long time ago, but I believe that I either checked the Gajim/Dino code when I implemented this or talked with one of the devs about it.

Now about why we had this check in the first place:
For some things, like XEP-0308: Last Message Correction or XEP-0444: Message Reactions, we need the ID to reference the message.
Example:

message id = 123 text = hullo
correct message with id 123 to hello

If we have the same ID several times in the DB how do we know which one to correct? How do other clients do this? Just support these features if the ID is only present ONCE and otherwise not offer the feature for this message? Instead of aborting to insert duplicate IDs just give them a dummy ID and thus also not allow correction (and other features) on those?

One thing that we should do for sure is: if we check, then we shouldn't just check the ID but also the sender.
Also, different IDs are at play here: some are generated by the server, others by the client. It is true that we somehow need to deal with clients that don't follow the spec exactly, without breaking things for our users. Whether we trust the server to generate unique IDs is yet another matter.
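The point about checking the sender as well as the ID can be sketched as follows. This is a minimal illustration, not Profanity's actual code: `stored_msg_t` and `is_duplicate` are hypothetical names, and the stored messages are kept in a plain array instead of the SQLite table. It also treats a missing (NULL) ID as never matching, which is one way to avoid the silent NULL-comparison problem mentioned earlier.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record; not Profanity's ProfMessage. */
typedef struct {
    const char* stanza_id; /* may be NULL when the stanza carried no id */
    const char* from_jid;
} stored_msg_t;

/* NULL-safe string equality: a NULL never equals anything, so messages
 * without an id are never treated as duplicates of each other. */
static bool
same_str(const char* a, const char* b)
{
    return a != NULL && b != NULL && strcmp(a, b) == 0;
}

/* Returns true only if a stored message has the same id AND the same sender. */
bool
is_duplicate(const stored_msg_t* stored, size_t count,
             const char* stanza_id, const char* from_jid)
{
    for (size_t i = 0; i < count; i++) {
        if (same_str(stored[i].stanza_id, stanza_id)
            && same_str(stored[i].from_jid, from_jid)) {
            return true;
        }
    }
    return false;
}
```

With this check, the same ID reused by a different sender is not flagged as a duplicate, which is the case an ID-only check gets wrong.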

What we should do, and the questions we need to answer:

- Maybe create an overview of the various IDs that exist: origin-id (XEP-0359), stanza-id (XEP-0359), stanza id (RFC 6121), replace id (XEP-0308). Are there more?
- Which of the IDs do we need to save?
- Which of them need to be unique?
- Find out what other clients do: not just IF they insert or not, but also HOW they then handle the cases where they want to use a feature (correction, for example) with a duplicate ID.

About this PR's title, "Refactor message insertion query and duplicate check": I think "refactor" is not quite right. To me that would mean that the functionality stays the same :)

H3rnand3zzz · 2023-10-10T07:56:24Z

The noticeable behavioural change (letting non-unique stanzas in) now comes with a non-default setting, so it can be a matter of preference. Duplicate stanzas are logged for everyone, though, so we get more diagnostic data.

jubalh · 2023-10-10T16:44:25Z

I talked with someone about this.
Let's do it like this:
Save all messages even with duplicate IDs in DB.
If the server generated stanza-id from XEP-0359: Unique and Stable Stanza IDs is not unique then we use log_warn and print it to console. For the rest (client-generated IDs) we don't warn; we just add them to the DB.

With things like Last Message Correction we just use the newest/latest message with that ID. (And maybe try to find out what other clients do here).
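The "use the newest message with that ID" rule might look roughly like this. It is only a sketch under assumptions: `row_t` and `find_latest_by_id` are hypothetical names, rows live in an array rather than in the `ChatLogs` table, and "newest" is taken to mean the greatest timestamp, with later rows winning ties.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified row; field names are illustrative only. */
typedef struct {
    const char* stanza_id; /* may be NULL */
    long timestamp;        /* seconds since epoch */
    const char* text;
} row_t;

/* Returns the index of the newest row with the given id, or -1 if none.
 * On equal timestamps the later row in the array wins. */
int
find_latest_by_id(const row_t* rows, size_t count, const char* stanza_id)
{
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (rows[i].stanza_id == NULL || strcmp(rows[i].stanza_id, stanza_id) != 0) {
            continue; /* no id or a different id */
        }
        if (best < 0 || rows[i].timestamp >= rows[best].timestamp) {
            best = (int)i;
        }
    }
    return best;
}
```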

Would you mind changing the PR to this behaviour?

jubalh · 2023-10-10T16:50:13Z

BTW, keep in mind to check src/xmpp/xmpp.h: in ProfMessage you can see which IDs we save and their XEPs, and also the name of the DB entry (in the case of the archive id).

H3rnand3zzz · 2023-10-11T09:39:32Z

> If the server generated stanza-id from XEP-0359: Unique and Stable Stanza IDs is not unique then we use log_warn and print it to console.

It might flood the console if, for example, server sets all stanza-ids to "0". We can just warn; I think that would be the preferable behaviour.

> With things like Last Message Correction we just use the newest/latest message with that ID. (And maybe try to find out what other clients do here).

That can be done

> Would you mind changing the PR to this behaviour?

Surely, but you didn't mention whether setting is still needed. I think I can change the default and keep it, just in case someone has a problem with duplicate messages.

weiss · 2023-10-11T09:55:56Z

> if, for example, server sets all stanza-ids to "0".

That would be a critical server-side bug, breaking history/device sync and whatnot, comparable with, dunno, dropping stanzas. I think a client should complain as loud as possible in that case.

jubalh · 2023-10-11T20:21:49Z

> Surely, but you didn't mention whether setting is still needed

Not needed.

H3rnand3zzz · 2023-10-12T07:36:06Z

@jubalh can we discuss some changes in a chat?

jubalh · 2023-10-12T20:41:13Z

> @jubalh can we discuss some changes in a chat?

I'm on vacation and only check GitHub sporadically.
Other people monitor this PR too to provide guidance. But probably most things are already mentioned I believe?

H3rnand3zzz · 2023-10-13T07:14:29Z

The problem is with replacement. Right now it is handled by a quite complicated SQL query:

```c
auto_sqlite gchar* query = sqlite3_mprintf(
    "SELECT * FROM ("
    "SELECT COALESCE(B.`message`, A.`message`) AS message, A.`timestamp`, A.`from_jid`, A.`type`, A.`encryption` "
    "from `ChatLogs` AS A "
    "LEFT JOIN `ChatLogs` AS B ON A.`stanza_id` = B.`replace_id` "
    "WHERE A.`replace_id` = '' "
    "AND ((A.`from_jid` = '%q' AND A.`to_jid` = '%q') OR (A.`from_jid` = '%q' AND A.`to_jid` = '%q')) "
    "AND A.`timestamp` < '%q' "
    "AND (%Q IS NULL OR A.`timestamp` > %Q) "
    "ORDER BY A.`timestamp` %s LIMIT %d) "
    "ORDER BY `timestamp` %s;",
    contact_barejid, myjid->barejid, myjid->barejid, contact_barejid,
    end_date_fmt, start_time, start_time, sort1, MESSAGES_TO_RETRIEVE, sort2);
```

Even right now it has bugs: if you replace the same message twice or more, it will replace it visually, but after loading history, it will show only the first correction.

What we want to do is barely within the scope of an SQL query (at least not the way it is implemented right now), as we need to select all the replacements and apply them only in the first case. I see a few ways to implement it, but none of them are to my liking.
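Outside of SQL, the intended "last correction wins" behaviour could be sketched like this. The names `log_row_t` and `resolve_text` are hypothetical, rows are assumed to be in arrival order with non-NULL strings, and every correction is assumed to reference the original message's ID in its replace ID. It illustrates the desired result, not how the history query should be written.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified row; rows are assumed to be in arrival order
 * and all strings are assumed to be non-NULL. */
typedef struct {
    const char* stanza_id;
    const char* replace_id; /* "" for an original message, else the id being corrected */
    const char* text;
} log_row_t;

/* Returns the text to display for the original message with the given id:
 * the body of the LAST correction that references it, or the original body
 * if there is none. Returns NULL if the original is not found. */
const char*
resolve_text(const log_row_t* rows, size_t count, const char* original_id)
{
    const char* text = NULL;
    for (size_t i = 0; i < count; i++) {
        if (rows[i].replace_id[0] == '\0' && strcmp(rows[i].stanza_id, original_id) == 0) {
            text = rows[i].text; /* the original itself */
        } else if (text != NULL && strcmp(rows[i].replace_id, original_id) == 0) {
            text = rows[i].text; /* a later correction overrides earlier ones */
        }
    }
    return text;
}
```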

As an alternative solution, we can use "replaces_id", which would be linked to the DB incremental ID or limit replacement capabilities to only the most recent messages.

All the proposed changes would either add complexity to the query and make it slower, or require further development, changes in the DB, etc.

My proposal is to add only the diagnostic part for now, possibly removing the setting. After studying the problem further, we can understand its underlying causes and fix them directly, rather than relying on major changes that only potentially fix the problem.

Added logging, uniqueness checking logic remains the same. For details see profanity-im#1893 discussion

jubalh · 2023-10-14T09:21:49Z

> Even right now it has bugs: if you replace the same message twice or more, it will replace it visually, but after loading history, it will show only the first correction.

Great find! I propose to solve this in another PR since, as I understand it, it's not related to DB but to presentation.

Please improve commit message a bit. We are not refactoring anything as far as I can see :)
PR title also needs to be adapted. It's good to point to the PR to see the full discussion and how we arrived at this solution. But the solution should still be described a bit more in the commit message.

I would also mention in the commit message why we don't want those duplicate IDs, and maybe even add a short comment above the check in the code (something like "the stanza-id should be unique according to XEP..."), at least when coming from the same server.

I didn't check the SQL code deeply (yet), but the code looks generally fine, I think. For duplicate_exists, I might have added some info that it is about stanza-id or archive_id or something. But that is not necessary.

src/database.c

H3rnand3zzz · 2023-10-14T10:33:12Z

@jubalh , please, contact me ASAP, I have important findings.

jubalh · 2023-10-14T10:45:29Z

> @jubalh , please, contact me ASAP, I have important findings.

As mentioned I'm on vacation. I now specifically went to a place where I can access my chat client.. And I see that you haven't written anything to me and even didn't answer to my last messages there..

H3rnand3zzz · 2023-10-14T11:05:10Z

> > @jubalh , please, contact me ASAP, I have important findings.
>
> As mentioned I'm on vacation. I now specifically went to a place where I can access my chat client.. And I see that you haven't written anything to me and even didn't answer to my last messages there..

Sorry, missed the notifications. Anyways, I answered and we discussed the issue and came to a conclusion.

Add logging for cases when the message is not being inserted in the DB due to its ID. Make diagnostic of unreliable message history much easier, lay the groundwork for a fully reliable message history. Further changes might include insertion of messages with non-unique stanza-id, but as it's a big change with plentiful issues, decision was made to further investigate potential causes of history unreliability. For details see profanity-im#1893 discussion

jubalh · 2023-10-14T22:40:36Z

Originally I believed we would change the behaviour of the DB logging in this PR. But I guess just adding some logging is OK as well.

@jubalh

When we received a message correction via `XEP-0308: Last Message Correction` we accepted the change without checking the sender making it possible for anybody to replace the message if the ID was known. This change has been proposed by @jubalh profanity-im#1893 (comment)
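A sender check of the kind described in this commit message could be sketched as below. This is an assumption-laden illustration, not the code from the commit: `correction_allowed` and its helpers are hypothetical, the comparison is a plain byte-wise match on the bare JID, and it only fits 1:1 chats (in a MUC the sender is identified by the full occupant JID, so the resource part would have to be compared as well).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Length of the bare JID part (everything before the first '/'). */
static size_t
bare_len(const char* jid)
{
    const char* slash = strchr(jid, '/');
    return slash ? (size_t)(slash - jid) : strlen(jid);
}

/* True if both JIDs share the same bare JID (the resource is ignored). */
static bool
same_bare_jid(const char* a, const char* b)
{
    size_t la = bare_len(a);
    return la == bare_len(b) && strncmp(a, b, la) == 0;
}

/* Accept a correction only when it comes from the sender of the original. */
bool
correction_allowed(const char* original_from, const char* correction_from)
{
    if (original_from == NULL || correction_from == NULL) {
        return false;
    }
    return same_bare_jid(original_from, correction_from);
}
```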

Please, backup your DB before performing any testing with changes in code. Introduce new DB structure and DB migration mechanism. The structure despite being unintuitive allows better maintainability, **performance** and data integrity. The new way is to replace original message, while keeping the original in LMC messages, this way we can preserve full edit history and avoid wasteful JOINs on every scroll up. Change the way LMC messages are being displayed. Now we check if we can replace a message from current buffer. If we don't have a message in the buffer, it might've been lost, but we can still display it as a new message. Index `timestamp` column as well. Performance boost is noticeable. Further information available here: profanity-im#1893 profanity-im#1899 profanity-im#1902

Please, backup your DB before performing any testing with changes in code. Introduce new DB structure and DB migration mechanism. The structure despite being unintuitive allows better maintainability, **performance** and data integrity. The new way is to replace original message, while keeping the original in LMC messages, this way we can preserve full edit history and avoid wasteful JOINs on every scroll up. Change the way LMC messages are being displayed. Now we check if we can replace a message from current buffer. If we don't have a message in the buffer, it might've been lost, but we can still display it as a new message. Index `timestamp` column as well. Further information available here: profanity-im#1893 profanity-im#1899 profanity-im#1902

Please, backup your DB before performing any testing with changes in code. Introduce new DB structure and DB migration mechanism. The structure despite being unintuitive allows better maintainability, **performance** and data integrity. The new way is to replace original message, while keeping the original in LMC messages, this way we can preserve full edit history and avoid wasteful JOINs on every scroll up. Change the way LMC messages are being displayed. Now we check if we can replace a message from current buffer. If we don't have a message in the buffer, it might've been lost, but we can still display it as a new message. Write messages with duplicate stanza IDs in the DB and allow LMC for them. Index `timestamp` column as well. Further details are available here: profanity-im#1893 profanity-im#1899 profanity-im#1902

Please, backup your DB before performing any testing with changes in code. Introduce new DB structure and DB migration mechanism. Now LMC messages are interconnected with original messages, this way we have fast access to last (hence correct) applicable edits, as well as reference to the original message from any edit (in case of chained edits). Change the way LMC messages are being displayed. Now we check if we can replace a message from current buffer. If we don't have a message in the buffer, it might've been lost, but we can still display it as a new message. Index `timestamp`, `to_jid`, `from_jid` columns to improve performance. Further information available here: profanity-im#1893 profanity-im#1899 profanity-im#1902

**Please, backup your DB before performing any testing.** Introduce new DB structure and DB migration mechanism. Now LMC messages are interconnected with original messages, this way we have fast access to last (hence correct) applicable edits, as well as reference to the original message from any edit (in case of chained edits). Change the way LMC messages are being displayed. Now we check if we can replace a message from current buffer. If we don't have a message in the buffer, it might've been lost, but we can still display it as a new message. Index `timestamp`, `to_jid`, `from_jid` columns to improve performance. Further information available here: profanity-im#1893 profanity-im#1899 profanity-im#1902

**Please, backup your DB before performing any testing.** Introduce new DB structure and DB migration mechanism. Index `timestamp`, `to_jid`, `from_jid` columns to improve performance. Add trigger for `replaced_by_db_id` calculation by DB on message insert. Now LMC messages are interconnected with original messages, this way we have fast access to last (hence correct) applicable edits, as well as reference to the original message from any edit (in case of chained edits). Change the way LMC messages are being displayed. Now we check if we can replace a message from current buffer. If we don't have a message in the buffer, it might've been lost, but we can still display it as a new message. Further information available here: profanity-im#1893 profanity-im#1899 profanity-im#1902

jubalh assigned H3rnand3zzz Oct 9, 2023

jubalh requested a review from sjaeckel October 9, 2023 06:53

jubalh added this to the next milestone Oct 9, 2023

H3rnand3zzz force-pushed the fix/history-reliability branch 2 times, most recently from 8a7b609 to 15bd03d Compare October 9, 2023 15:13

H3rnand3zzz changed the title ~~Refactor message insertion query and duplicate check~~ Introduce /uniquestanzas command to improve reliability of DB message history Oct 9, 2023

H3rnand3zzz added a commit to H3rnand3zzz/profanity that referenced this pull request Oct 13, 2023

Refactor message insertion query and duplicate check

18ae147

Added logging, uniqueness checking logic remains the same. For details see profanity-im#1893 discussion

H3rnand3zzz force-pushed the fix/history-reliability branch from 15bd03d to 18ae147 Compare October 13, 2023 20:17

jubalh reviewed Oct 14, 2023

H3rnand3zzz changed the title ~~Introduce /uniquestanzas command to improve reliability of DB message history~~ Improve logging of DB message insertion Oct 14, 2023

H3rnand3zzz force-pushed the fix/history-reliability branch from 18ae147 to 403f924 Compare October 14, 2023 11:29

jubalh merged commit dcab697 into profanity-im:master Oct 14, 2023
6 checks passed

jubalh mentioned this pull request Oct 17, 2023

Save all received messages to DB even if duplicate ID is used #1899

Closed

H3rnand3zzz mentioned this pull request Oct 28, 2023

Change DB structure #1902

Merged

4 tasks

H3rnand3zzz deleted the fix/history-reliability branch November 6, 2023 11:54
