ActionGenerator, Part One

In this post we’ll introduce you to ActionGenerator, one of several open source projects we are working on. ActionGenerator lets you generate actions (you can also think of actions as events) from an action source and play them, using ActionGenerator’s action player, into one of the sinks. ActionGenerator comes with several action sources and sinks, but you can easily implement custom ones and play them with ActionGenerator. Let’s dig into the details.

This is the first part of a two-part series in which we show what ActionGenerator is, how you can use it for your needs, how you can extend it, and finally which existing action generators are available for you to use out of the box.

What is ActionGenerator?

ActionGenerator is focused on generating actions (aka events) of your choice. Imagine you want to feed your search engine with millions of documents, or you want to run a stress test and see whether your application can work under load for hours. That’s exactly where you can use ActionGenerator. If the existing sources and sinks don’t fit your needs, all you have to do is write a simple action type, your action source, and a sink to consume those actions, and you are ready to go. The rest, which includes playing those actions, parallelizing them across multiple threads, and gathering performance metrics and stats, is done by ActionGenerator itself. You only need to worry about your actions.

Current Status

So far at Sematext we’ve written all the code needed to generate data and query actions for search engines like Apache Solr, ElasticSearch and SenseiDB. All of this is included in ActionGenerator, so you can use it, too. We’ll expand the number of sources and sinks over time based on our own needs, but if you would like to add support for other sources and sinks, please issue a pull request. Contributions are always very welcome! 🙂

Project Layout

Currently, the ActionGenerator project is divided into three modules:

  • ag-player – common classes that enable implementing and running ActionGenerator for virtually any kind of actions.
  • ag-player-es – ActionGenerator implementation for ElasticSearch.
  • ag-player-solr – ActionGenerator implementation for Apache Solr.

Before talking about the currently available implementations, let’s see what is needed to implement your own action generator.

Implementation

The general idea is to have an extensible framework that requires you to write only a minimal amount of code before you can feed your system with a stream of custom actions/events. For example, that stream could be documents to be indexed by your search engine, queries to run against it, rows to be inserted into HBase, MySQL, Cassandra or any other database, or messages to be put into a queue. All you need to do is:

  1. provide an Event implementation that represents a single event you want to be generated
  2. provide a Source implementation that creates your events
  3. provide a Sink implementation that consumes your events

Finally, just glue the above three things together in a very simple main class and you’re done. Let’s quickly talk about how to implement custom Events, Sources, and Sinks.

Implementing Event

In order to implement your own event you have to extend an abstract class called Event from the com.sematext.ag.event package. This abstract Event class doesn’t have any methods you need to implement, so you’re free to implement your event any way you want. It doesn’t get easier than that, does it? For inspiration, have a look at the two existing Event implementations. The first, SimpleDataEvent, represents a single search engine document and is used for implementing sinks that index data. The second, SimpleSearchEvent, is used for generating simple queries.
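To make this concrete, here is a minimal sketch of a custom event. The Event class declared at the top is only a local stand-in for com.sematext.ag.event.Event so that the snippet compiles on its own; in a real project you would import the library class instead. QueueMessageEvent and its fields are hypothetical names chosen for illustration.

```java
// Local stand-in for com.sematext.ag.event.Event so this sketch compiles on its own.
// In a real project, import the library class instead of declaring this one.
abstract class Event {
}

// Hypothetical event type: one message destined for a queue.
final class QueueMessageEvent extends Event {
  private final String queueName;
  private final String payload;

  QueueMessageEvent(String queueName, String payload) {
    this.queueName = queueName;
    this.payload = payload;
  }

  String getQueueName() {
    return queueName;
  }

  String getPayload() {
    return payload;
  }
}
```

Keeping the event immutable is a design choice of this sketch, not a requirement stated by the framework; it simply makes the event safe to hand from a source to a sink across threads.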

Implementing Source

To implement an event source, you need to implement an interface called Source from the com.sematext.ag.source package or extend one of the abstract implementations of this interface, such as FiniteEventSource. There are just three methods in this interface that you need to implement:

  • void init(PlayerConfig config) – method called to initialize your event source. All the initialization should go here.
  • void close() – method invoked when event source is closing (actually, when a player closes that event source). All the cleaning and closing stuff should go here.
  • Event nextEvent() – the most important method, responsible for generating your actual Event implementations. For every call you should create an event and return it.

As mentioned, there is already an implementation available: FiniteEventSource in the com.sematext.ag.source package. It’s an abstract class that enables you to implement an event source that will generate a finite number of events. If you want to see the actual event source implementations for the two Event classes mentioned earlier, SimpleDataEvent and SimpleSearchEvent, please look at DataDictionaryEventSource and SearchDictionaryPhraseEventSource.
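The three Source methods can be sketched roughly as follows. Everything declared before NumberEvent is a local stand-in that mirrors the method shapes described in this post, so the snippet compiles by itself; the real classes live in the ActionGenerator packages and may differ in detail. In particular, the get accessor on PlayerConfig, the MAX_EVENTS_KEY value, and returning null once the source is exhausted are assumptions made for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-ins mirroring the shapes described in the post; not the library classes.
abstract class Event {
}

class PlayerConfig {
  private final Map<String, String> values = new HashMap<>();

  // Alternating key/value arguments, like the constructor call shown in the post.
  PlayerConfig(String... keysAndValues) {
    for (int i = 0; i + 1 < keysAndValues.length; i += 2) {
      values.put(keysAndValues[i], keysAndValues[i + 1]);
    }
  }

  // Hypothetical accessor; the real PlayerConfig API may look different.
  String get(String key) {
    return values.get(key);
  }
}

interface Source {
  void init(PlayerConfig config);

  void close();

  Event nextEvent();
}

// Hypothetical event carrying a single number.
final class NumberEvent extends Event {
  final int value;

  NumberEvent(int value) {
    this.value = value;
  }
}

// Hypothetical source that emits 0, 1, 2, ... up to a configured number of events.
final class CountingSource implements Source {
  static final String MAX_EVENTS_KEY = "countingSource.maxEvents"; // hypothetical key

  private int remaining;
  private int next;

  @Override
  public void init(PlayerConfig config) {
    // Throws NumberFormatException if the key is missing or not a number.
    remaining = Integer.parseInt(config.get(MAX_EVENTS_KEY));
    next = 0;
  }

  @Override
  public void close() {
    remaining = 0; // nothing else to release in this sketch
  }

  @Override
  public Event nextEvent() {
    if (remaining <= 0) {
      return null; // assumption: null signals that no more events are available
    }
    remaining--;
    return new NumberEvent(next++);
  }
}
```

For a source with a fixed number of events like this one, extending FiniteEventSource is probably the better fit in real code; the sketch implements the interface directly only to show all three methods.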

Implementing Sink

Finally, you have to develop your Sink or use one of the provided ones. To write a custom Sink you’ll want to extend the abstract Sink class from the com.sematext.ag.sink package. There are three methods that you will need to implement:

  • void init(PlayerConfig config) – method called to initialize your sink. All the initialization should go here.
  • void close() – method invoked when sink is closing. All the cleaning and closing stuff should go here.
  • boolean write(T event) – method that processes your event. This is the method where you should consume your event (e.g., send it to your search engine).

There is one abstract Sink extension available, AbstractHttpSink, which can help you develop sinks that use the HTTP protocol to send events.
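Here is a minimal sketch of a custom sink built around those three methods. As before, Event, PlayerConfig, and Sink are local stand-ins that follow the signatures listed above so the snippet compiles on its own; they are not the library classes. DocumentEvent and InMemorySink are hypothetical, and treating the boolean returned by write as "the event was consumed successfully" is our assumption here.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-ins mirroring the shapes described in the post; not the library classes.
abstract class Event {
}

class PlayerConfig {
}

abstract class Sink<T extends Event> {
  public abstract void init(PlayerConfig config);

  public abstract void close();

  public abstract boolean write(T event);
}

// Hypothetical event representing one document to be stored.
final class DocumentEvent extends Event {
  final String id;
  final String body;

  DocumentEvent(String id, String body) {
    this.id = id;
    this.body = body;
  }
}

// Hypothetical sink that keeps documents in memory instead of sending them anywhere.
final class InMemorySink extends Sink<DocumentEvent> {
  private final List<String> stored = new ArrayList<>();
  private boolean open;

  @Override
  public void init(PlayerConfig config) {
    open = true; // a real sink would open connections or clients here
  }

  @Override
  public void close() {
    open = false; // a real sink would release its resources here
  }

  @Override
  public boolean write(DocumentEvent event) {
    if (!open) {
      return false; // assumption: false means the event was not consumed
    }
    stored.add(event.id + ":" + event.body);
    return true;
  }

  List<String> stored() {
    return stored;
  }
}
```

A sink that talks to a search engine would replace the in-memory list with an HTTP call, which is the kind of work AbstractHttpSink is meant to help with.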

Running Your ActionGenerator

After you are done with your development, all you need to do is configure the player to run your action generator. To do that, you create a PlayerConfig instance and pass it to the static play method of the PlayerRunner class. This is what it might look like:

PlayerConfig config = new PlayerConfig(
    PlayerRunner.PLAYER_CLASS_CONFIG_KEY, RealTimePlayer.class.getName(),
    RealTimePlayer.MIN_ACTION_DELAY_KEY, "20",
    RealTimePlayer.MAX_ACTION_DELAY_KEY, "1000",
    RealTimePlayer.TIME_TO_WORK_KEY, "5400",
    RealTimePlayer.SOURCES_THREADS_COUNT_KEY, "2",
    RealTimePlayer.SOURCES_PER_THREAD_COUNT_KEY, "3",
    PlayerRunner.SOURCE_FACTORY_CLASS_CONFIG_KEY, SimpleSourceFactory.class.getName(),
    PlayerRunner.SINK_CLASS_CONFIG_KEY, SimpleQuerySolrSink.class.getName(),
    SimpleSourceFactory.SOURCE_CLASS_CONFIG_KEY, SearchRandomNumberEventSource.class.getName(),
    SearchRandomNumberEventSource.SEARCH_FIELD_NAME_KEY, "text",
    SearchRandomNumberEventSource.MAX_EVENTS_KEY, "100000",
    SimpleQuerySolrSink.SOLR_URL_KEY, "http://localhost:8983/solr/select");
PlayerRunner.play(config);

You can see different configuration values being passed as PlayerConfig constructor arguments, as alternating keys and values. First we say that we want RealTimePlayer to be used as the player for our events. Next, we have some of the configuration needed by the RealTimePlayer:

  • RealTimePlayer.MIN_ACTION_DELAY_KEY – minimum delay between actions.
  • RealTimePlayer.MAX_ACTION_DELAY_KEY – maximum delay between actions.
  • RealTimePlayer.TIME_TO_WORK_KEY – number of seconds a single thread should be working.
  • RealTimePlayer.SOURCES_THREADS_COUNT_KEY – number of threads the player should use for playing actions.
  • RealTimePlayer.SOURCES_PER_THREAD_COUNT_KEY – number of sources that should be used for each thread.

Next, you see the PlayerRunner configuration specifying the source factory class and the sink class. You’ll need these settings most of the time when using ActionGenerator.
Then comes SimpleSourceFactory.SOURCE_CLASS_CONFIG_KEY, which specifies which event source should be used. Here we use the event source that generates random numbers. SearchRandomNumberEventSource needs two values: the search field name (SEARCH_FIELD_NAME_KEY) and the number of events to be generated (MAX_EVENTS_KEY). The last configuration key and value is the address of Apache Solr for the SimpleQuerySolrSink, which we’ll cover in the second part of this two-part series.

The above piece of code can be placed in the public static void main(String[] args) method of a class. You can then use this class to run your action generator!
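If you are curious how a source, a sink, and a player fit together, the toy loop below may help. It is emphatically not how RealTimePlayer is implemented: it is single-threaded, has no delays, and gathers no metrics. All types are local stand-ins shaped like the ones described in this post, ToyPlayer, FixedSource, and PrintingSink are hypothetical, and we again assume that a source returns null when it has no more events.

```java
import java.util.Iterator;
import java.util.List;

// Stand-ins mirroring the shapes described in the post; not the library classes.
abstract class Event {
}

class PlayerConfig {
}

interface Source {
  void init(PlayerConfig config);

  void close();

  Event nextEvent();
}

abstract class Sink<T extends Event> {
  public abstract void init(PlayerConfig config);

  public abstract void close();

  public abstract boolean write(T event);
}

// Hypothetical event, source, and sink used only for this demonstration.
final class TextEvent extends Event {
  final String text;

  TextEvent(String text) {
    this.text = text;
  }
}

final class FixedSource implements Source {
  private final Iterator<String> texts;

  FixedSource(List<String> texts) {
    this.texts = texts.iterator();
  }

  @Override
  public void init(PlayerConfig config) {
  }

  @Override
  public void close() {
  }

  @Override
  public Event nextEvent() {
    // Assumption: null means the source is exhausted.
    return texts.hasNext() ? new TextEvent(texts.next()) : null;
  }
}

final class PrintingSink extends Sink<Event> {
  @Override
  public void init(PlayerConfig config) {
  }

  @Override
  public void close() {
  }

  @Override
  public boolean write(Event event) {
    System.out.println(((TextEvent) event).text);
    return true;
  }
}

final class ToyPlayer {
  // Initializes both ends, moves every event from source to sink, then closes both.
  // Returns the number of events the sink reported as written.
  static int play(PlayerConfig config, Source source, Sink<Event> sink) {
    source.init(config);
    sink.init(config);
    int written = 0;
    try {
      for (Event e = source.nextEvent(); e != null; e = source.nextEvent()) {
        if (sink.write(e)) {
          written++;
        }
      }
    } finally {
      source.close();
      sink.close();
    }
    return written;
  }
}
```

The point of the sketch is the division of labor: your code decides what an event is, where it comes from, and where it goes, while the player owns the lifecycle and the loop.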

Summing Up

In this post we’ve shown how to create your own action generator and how to run it with PlayerRunner. In the next part of this two-part series we will show which action generators are already provided, so that you can use them without developing custom sources and sinks. In addition, we will share some plans for the future of this small project. If you have any questions or suggestions, please leave a comment. If you need support for sources and sinks we haven’t implemented yet, please just open an issue.

@sematext
