We could attempt online self-training on the user queries submitted to the system whenever the confidence in the predicted label is above a certain threshold (like 80%).
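A rough sketch of the threshold check, assuming the classifier can return a mapping of label to probability. The function name `select_self_training_label` and the constant `CONFIDENCE_THRESHOLD` are made up for illustration; 0.80 is just the "like 80%" figure from above, not a tuned value.

```python
from typing import Dict, Optional, Tuple

# Assumed value, taken from the "like 80%" suggestion; would need tuning.
CONFIDENCE_THRESHOLD = 0.80


def select_self_training_label(
    probabilities: Dict[str, float],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> Optional[Tuple[str, float]]:
    """Return (label, confidence) if the top label clears the threshold, else None."""
    if not probabilities:
        return None
    # Pick the label with the highest predicted probability.
    label, confidence = max(probabilities.items(), key=lambda item: item[1])
    if confidence > threshold:
        return label, confidence
    return None
```

A query whose best label only reaches, say, 0.6 would return `None` and would not be used for training.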
We should definitely log every query used for training so that the classifier can be brought back to the same point if the server is restarted. In addition to logging the document and label to an automatically-labeled-set file, we could also update the classifier in real time.
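One possible shape for the log and the restart replay, as a sketch only. It assumes a JSON Lines file (one `{"text": ..., "label": ...}` record per line) and a `train_fn(text, label)` callback standing in for whatever the classifier's incremental-update method turns out to be; all of these names are hypothetical.

```python
import json
from pathlib import Path
from typing import Callable, Iterator, Tuple


def log_training_example(log_path: Path, text: str, label: str) -> None:
    """Append one automatically labeled example to the log file."""
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps({"text": text, "label": label}) + "\n")


def read_training_log(log_path: Path) -> Iterator[Tuple[str, str]]:
    """Yield (text, label) pairs in the order they were logged."""
    if not log_path.exists():
        return
    with log_path.open("r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                record = json.loads(line)
                yield record["text"], record["label"]


def replay_training_log(log_path: Path, train_fn: Callable[[str, str], None]) -> int:
    """Re-apply every logged example on startup; returns how many were replayed."""
    count = 0
    for text, label in read_training_log(log_path):
        train_fn(text, label)
        count += 1
    return count
```

Writing to the log before (or together with) the real-time update means a restart can replay exactly the examples the classifier had already seen.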
We should probably first check that the combined text-label pair is unique, because I don't think we want the classifier to be biased by learning from the same instance more than once. If the text is the same but the labels are different, we might want to flag that for review.
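The uniqueness check could look something like the following. This is a sketch under a few assumptions: texts are compared by exact string match (no normalization), everything is kept in memory, and a conflicting label is held back from training until someone reviews it. The class name `SelfTrainingFilter` is invented for the example.

```python
from typing import Dict, List, Tuple


class SelfTrainingFilter:
    """Tracks which texts have been trained on and flags label conflicts."""

    def __init__(self) -> None:
        # First label accepted for each text (exact string match).
        self.label_by_text: Dict[str, str] = {}
        # (text, previously accepted label, new conflicting label)
        self.flagged_for_review: List[Tuple[str, str, str]] = []

    def should_train(self, text: str, label: str) -> bool:
        """Return True only the first time a text is seen."""
        known_label = self.label_by_text.get(text)
        if known_label is None:
            self.label_by_text[text] = label
            return True
        if known_label != label:
            # Same text, different label: record it for manual review.
            self.flagged_for_review.append((text, known_label, label))
        # Either an exact duplicate or a conflict; skip training in both cases.
        return False
```

After a restart, the filter would need to be rebuilt from the logged examples so that duplicates are still detected.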
This would only work if the classifier were centralized, because if we had multiple instances loaded (to increase the number of requests handled per second), each classifier would be learning from different novel data and the instances would drift apart.