Computing Resources Analysis for the Event Detection System

In a previous post, we discussed how we implemented the initial version of our event detection system. If you haven’t read that post yet, we encourage you to do so before continuing.

As the news stream develops, the number of clusters grows over time, and so does the number of similarity measurements needed for each incoming article. This shows up as a longer clustering time per article and as higher overall consumption of computing resources.
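To make the cost growth concrete, here is a minimal sketch. It assumes a single-pass scheme in which every incoming article is compared against the centroid of every active cluster using cosine similarity; the function names and the 0.8 threshold are hypothetical, and the actual system may work differently.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign(article_vec, centroids, threshold=0.8):
    """Assign an article to the most similar cluster, or open a new one.

    Returns (cluster index, number of similarity computations performed).
    The threshold value is an illustrative assumption.
    """
    comparisons = 0
    best_idx, best_sim = None, threshold
    for idx, centroid in enumerate(centroids):
        sim = cosine(article_vec, centroid)
        comparisons += 1
        if sim >= best_sim:
            best_idx, best_sim = idx, sim
    if best_idx is None:
        centroids.append(list(article_vec))  # no match: start a new cluster
        best_idx = len(centroids) - 1
    return best_idx, comparisons


centroids = [[1.0, 0.0], [0.0, 1.0]]
print(assign([1.0, 0.1], centroids))    # joins cluster 0 after 2 comparisons
print(assign([-1.0, -1.0], centroids))  # opens cluster 2 after 2 comparisons
print(assign([0.5, 0.5], centroids))    # now costs 3 comparisons
```

Under this assumption, the work per article is linear in the number of active clusters, which is why removing outdated clusters directly bounds the clustering time.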

In this post, we investigate different configurations for decommissioning outdated events, a process that aims to limit the resources the event detection system consumes over time.

Resources Analysis

The following analysis was done using Almeta’s database snapshot for September and October 2019, restricted to articles tagged as political news.

For each decommissioning configuration, we generated four graphs. In all of them, “evolution” refers to the progress of the clustering process, measured by the number of articles processed up to a given point in time. The four graphs are:

  • The execution time evolution: the computing time, in seconds, needed to cluster a new incoming article.
  • The clusters count evolution: the number of clusters (events) that were generated and preserved in the model up to a given point in time.
  • The memory usage evolution: the maximum memory, in MB, consumed by the algorithm to cluster an incoming article.
  • The model size evolution: the size of the model, in MB, when saved as a pickle file.
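The four metrics above could be collected per article along the following lines. This is a minimal sketch under stated assumptions: the model is represented as a plain dict with a "clusters" list, and `measure_clustering_step`, `toy_cluster_fn`, and the metric key names are hypothetical. Note also that `tracemalloc` only tracks memory allocated by Python, so its numbers are not directly comparable to process-level figures, and tracing adds overhead to the measured time.

```python
import pickle
import time
import tracemalloc


def measure_clustering_step(model, article, cluster_fn):
    """Cluster one article and return the four metrics for that step."""
    tracemalloc.start()
    start = time.perf_counter()
    cluster_fn(model, article)
    elapsed = time.perf_counter() - start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "exec_time_sec": elapsed,
        "clusters_count": len(model["clusters"]),
        "peak_memory_mb": peak_bytes / 1024 ** 2,
        # Size of the pickled model, measured in memory instead of on disk.
        "model_size_mb": len(pickle.dumps(model)) / 1024 ** 2,
    }


def toy_cluster_fn(model, article):
    """Stand-in for the real clustering step: every article opens a cluster."""
    model["clusters"].append([article])


model = {"clusters": []}
history = [
    measure_clustering_step(model, f"article {i}", toy_cluster_fn)
    for i in range(3)
]
for step, metrics in enumerate(history, start=1):
    print(step, metrics)
```

Plotting each metric in `history` against the step index gives one "evolution" curve per metric.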

The Current Decommissioning Method

The method proposed in …

Conclusions:

  • The method succeeded in keeping the clusters count nearly stable, which in turn keeps resource consumption nearly stable too.
  • Although the execution time has many peaks, it is generally stable at around 0.5 sec.
  • The memory usage stayed stable at 330 MB for a long period.
  • The pickled model size kept increasing.
  • The maximum execution time was ~2.5 sec.
  • The maximum memory usage was ~330 MB.
  • The model size reached 3.5 MB.
  • The maximum active clusters count was ~250.

New Configuration (1):

In the previous method, we preserved all events younger than 3 days, regardless of their size. To improve on that with stricter constraints, we tried decommissioning all events that were added during the last day and still contained only one article.

Conclusions:

  • The method succeeded in keeping the clusters count more stable than the current method did.
  • Although the execution time has many peaks, it is generally stable at slightly below 0.5 sec.
  • The memory usage stayed stable for a long time at 260 MB, lower than with the previous method.
  • The pickled model size kept increasing.
  • The maximum execution time was ~2.0 sec.
  • The maximum memory usage was ~260 MB.
  • The model size reached 3.5 MB.
  • The maximum active clusters count was ~120.

New Configuration (2):

In addition to configuration (1), we decommission all events that were added before the last day and still contain three articles or fewer.
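The two added rules can be sketched as a simple filter. This is an illustration under stated assumptions, not the actual implementation: the `Event` class and function names are hypothetical, "added during the last day" is read as an age of at most one day at the moment decommissioning runs, and the baseline rule for older events is not modeled because its details are not given here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

ONE_DAY = timedelta(days=1)


@dataclass
class Event:
    event_id: int
    created_at: datetime
    article_count: int


def should_decommission(event, now, use_rule_2=False):
    """Apply the rule of configuration (1), and optionally configuration (2)."""
    age = now - event.created_at
    if age <= ONE_DAY:
        # Configuration (1): recent events that are still singletons.
        return event.article_count == 1
    if use_rule_2:
        # Configuration (2): older events with three articles or fewer.
        return event.article_count <= 3
    return False


def decommission(events, now, use_rule_2=False):
    """Return only the events that survive the decommissioning pass."""
    return [e for e in events if not should_decommission(e, now, use_rule_2)]


now = datetime(2019, 10, 1)
events = [
    Event(1, now - timedelta(hours=5), 1),  # recent singleton
    Event(2, now - timedelta(hours=5), 2),  # recent, already growing
    Event(3, now - timedelta(days=2), 3),   # older, small
    Event(4, now - timedelta(days=2), 8),   # older, large
]
print([e.event_id for e in decommission(events, now)])
print([e.event_id for e in decommission(events, now, use_rule_2=True)])
```

With this sample data, configuration (1) drops only the recent singleton, while configuration (2) also drops the older three-article event.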

Conclusions:

  • The method succeeded in keeping the clusters count more stable than both the current method and configuration (1) did.
  • Although the execution time has many peaks, it is generally stable at slightly below 0.5 sec.
  • The memory usage stayed stable for a long time at around 256 MB.
  • The pickled model size appears to start stabilizing.
  • The maximum execution time was ~2.0 sec.
  • The maximum memory usage was ~256 MB.
  • The model size reached 3.0 MB.
  • The maximum active clusters count was ~100.

In Summary


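| Metric                       | Current Config | New Config (1) | New Config (2) |
|------------------------------|----------------|----------------|----------------|
| Stable clusters count        | YES            | YES            | YES            |
| Stable execution time        | YES            | YES            | YES            |
| Stable memory usage          | YES            | YES            | YES            |
| Stable model size            | NO             | NO             | Almost YES     |
| Average execution time (sec) | 0.7 +/- 0.2    | 0.4 +/- 0.1    | 0.4 +/- 0.2    |
| Max execution time (sec)     | 2.5            | 2              | 2              |
| Max memory usage (MB)        | 330            | 260            | 256            |
| Max model size (MB)          | 3.5            | 3.5            | 3.0            |
| Max active clusters          | 250            | 120            | 100            |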
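To put the reported maxima in relative terms, the following short sketch computes the percentage reduction of each new configuration against the current one. The numbers are taken from the summary above; the dictionary keys and the `reduction_pct` helper are hypothetical names chosen for illustration.

```python
# Maximum values reported for each configuration in the summary.
baseline = {
    "max_exec_time_sec": 2.5,
    "max_memory_mb": 330,
    "max_model_size_mb": 3.5,
    "max_active_clusters": 250,
}
configs = {
    "New Config (1)": {
        "max_exec_time_sec": 2.0,
        "max_memory_mb": 260,
        "max_model_size_mb": 3.5,
        "max_active_clusters": 120,
    },
    "New Config (2)": {
        "max_exec_time_sec": 2.0,
        "max_memory_mb": 256,
        "max_model_size_mb": 3.0,
        "max_active_clusters": 100,
    },
}


def reduction_pct(old, new):
    """Percentage reduction from old to new, rounded to one decimal."""
    return round(100 * (old - new) / old, 1)


savings = {
    name: {key: reduction_pct(baseline[key], value) for key, value in metrics.items()}
    for name, metrics in configs.items()
}
for name, result in savings.items():
    print(name, result)
```

For example, configuration (2) cuts the maximum active clusters count by 60% and the maximum memory usage by roughly 22% relative to the current configuration.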

Did you know that we use this and other AI technologies in our app? See what you have just read about in action by trying the Almeta News app. You can download it from Google Play: https://play.google.com/store/apps/details?id=io.almeta.almetanewsapp&hl=ar_AR
