GSoC 2017 Final Report
Introduction
The last week of GSoC is approaching. We have achieved a lot, but there is still room for improvement and for new technologies worth trying. In this blog post, I will quickly summarize what we have done so far, what I worked on in particular, and what can be improved or explored further.
What was done
Overall, the biggest news is that we managed to remove JMS completely and use Apache Kafka in its place. That was our main goal when we started the project. Goal set, goal met. When a push message is sent via the UPS REST API, a Kafka producer handles it and adds it to a topic. A Kafka stream is initialized on this topic and routes the push messages to 6 different output topics based on their variant type (Android, iOS, Windows, etc.). These messages then go through further, more complicated processing involving several Kafka consumers and producers until the final push message sending is triggered.
The push sending part is not the only thing that can move to Kafka; the collection of push metrics is also meant to be implemented with Kafka. We have already started by adding:
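To make the routing step a bit more concrete, here is a minimal sketch in plain Java of how a variant type could be mapped to its output topic. It is not the code from our branch: the `VariantTopicRouter` class, the six enum constants and the `push-...` topic names are assumptions made up for this example, and the real decision happens inside the Kafka stream rather than in a standalone helper.

```java
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: picks an output topic name for a push message by variant type. */
final class VariantTopicRouter {

    /** Assumed set of six variant types; the real enum lives in the UPS model. */
    enum VariantType { ANDROID, IOS, SIMPLE_PUSH, WINDOWS_WNS, WINDOWS_MPNS, ADM }

    private static final Map<VariantType, String> TOPICS = new EnumMap<>(VariantType.class);

    static {
        for (VariantType type : VariantType.values()) {
            // Topic names are invented for this sketch, e.g. WINDOWS_WNS -> "push-windows-wns".
            String suffix = type.name().toLowerCase(Locale.ROOT).replace('_', '-');
            TOPICS.put(type, "push-" + suffix);
        }
    }

    private VariantTopicRouter() {
        // Static helper, not meant to be instantiated.
    }

    /** Returns the output topic a message of the given variant type would be sent to. */
    static String topicFor(VariantType type) {
        return TOPICS.get(type);
    }
}
```

Calling `VariantTopicRouter.topicFor(VariantTopicRouter.VariantType.IOS)` gives the iOS topic name. Keeping the mapping in one place means every variant type has exactly one output topic, which matches the one-topic-per-variant layout described above.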
- a topic which stores invalid tokens and a consumer that reads from it and deletes them
- a topic that stores whether a push message was successfully sent
- a topic that stores whether a push message for a specific iOS token was accepted by APNS
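The first item can be illustrated with a small sketch of the consume-and-delete idea. To keep it runnable without a broker, a `BlockingQueue` stands in for the Kafka topic and a `Set` stands in for the token store; both, along with the `InvalidTokenCleaner` name, are assumptions for this example and not the classes used in UPS.

```java
import java.util.Set;
import java.util.concurrent.BlockingQueue;

/** Hypothetical sketch: reads invalid tokens from a topic stand-in and deletes them. */
final class InvalidTokenCleaner {

    // Stand-in for the invalid token topic; a real consumer would poll Kafka instead.
    private final BlockingQueue<String> invalidTokenTopic;

    // Stand-in for the place where device tokens are stored.
    private final Set<String> tokenStore;

    InvalidTokenCleaner(BlockingQueue<String> invalidTokenTopic, Set<String> tokenStore) {
        this.invalidTokenTopic = invalidTokenTopic;
        this.tokenStore = tokenStore;
    }

    /**
     * Drains every pending record and removes the matching token from the store.
     * Returns how many tokens were actually deleted; unknown tokens are skipped.
     */
    int drainAndDelete() {
        int deleted = 0;
        String token;
        while ((token = invalidTokenTopic.poll()) != null) {
            if (tokenStore.remove(token)) {
                deleted++;
            }
        }
        return deleted;
    }
}
```

Because unknown tokens are simply skipped, processing the same record twice does no harm, which is convenient if a consumer re-reads records after a restart.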
As a continuation of GSoC, more metrics can be collected using Apache Kafka. Hopefully, sooner or later, our GSoC Kafka POC will become part of the master branch.
UnifiedPush Server pull requests
- AGPUSH-2102 First Kafka Consumer usage in UPS - added a consumer which reads from “installationMetrics” topic and updates metrics #853
- AGPUSH-2105 Research Dependency Inject for Kafka Consumers and Producers
- AGPUSH-2120 Add Kafka CDI library dependency #855
- AGPUSH-2152 Add sonarqube properties file #865
- AGPUSH-2109 Research Kafka Security AGPUSH-2144, AGPUSH-2143, AGPUSH-2142 and AGPUSH-2141
- AGPUSH-2154 Add instructions to Readme how to run Jacoco. #866
- AGPUSH-2125 Add Installation Metrics Consumer injection #876
- AGPUSH-2178 Remove @KafkaConfig annotation in the jaxrs module #888
- AGPUSH-2176 Test environment setup for Kafka consumers/producers #895
- AGPUSH-2159 Create Push Notification Sender Kafka Producer #896
- Remove *.orig files #883
- AGPUSH-2155 Check id before parsing the page size #867
- AGPUSH-2190 Remove Kafka module commit (decided not to be merged)
- AGPUSH-2167 NotificationSenderCallback onSuccess/onError Topic #903
- AGPUSH-2186 Invalid Token Topic #904
- AGPUSH-2202 Update ReadMe after integration of Kafka #913
- AGPUSH-2200 Generic pushMessage “success/failure” count with Kafka Streams [In Progress]
Kafka CDI library pull requests
- Fix typos in the readme #11
- Adding shutdown method to a consumer #12
- Adding myself as a developer #24
Problems
- AGPUSH-2169 Producer null when @Producer detected before @KafkaConfig
- AGPUSH-2174 Scanning in Kafka Module doesn’t work
- AGPUSH-2115 Resolve ‘auto.commit.interval.ms’ warning for Kafka producer configurations
- AGPUSH-2139 IllegalArgumentException when using Kafka CDI library
- AGPUSH-2140 InstanceAlreadyExistsException for a Kafka consumer during redeployment
- AGPUSH-2145 ComponentIsStoppedException after redeployment
Improvements
- AGPUSH-2156 Rename HttpRequestUtil.extractSortingQueryParamValue method
- AGPUSH-2158 Update comments of findAllForPushApplication method
- AGPUSH-2161 Duplication of code in variantEndpoint classes
- AGPUSH-2162 Update link comment for “Message Format Specification”
- AGPUSH-2128 Return response even if an installation metrics producer hasn’t finished
- AGPUSH-2151 MPNSPushNotificationSender - Use isEmpty() to check whether the collection is empty or not
- AGPUSH-2185 Replace String.format in the log statements
- AGPUSH-2195 Explore AdmPushNotificationSender errors per token
- AGPUSH-2196 Catch NetworkIOException when sending push messages to Windows
Other assignees’ tasks
- AGPUSH-2163 Use Kafka Streams for processing of push messages
- AGPUSH-2164 Commit template not working
- AGPUSH-2165 TokenLoader replacement with Kafka
- AGPUSH-2166 Create consumer that will replace sendMessagesToPushNetwork method
- AGPUSH-2114 Set up test environment for Kafka
- AGPUSH-2131 Improve @KafkaConfig annotation
- AGPUSH-2188 IOS Response Kafka Topic
- AGPUSH-2189 Kafka Consumer that deletes invalid tokens
- AGPUSH-2197 Check if wildfly full profile is needed
- AGPUSH-2200 Generic pushMessage “success/failure” count with Kafka Streams
Overall Stats
- We have 88 Jira tasks created; 25 are assigned to me, 25 to Dimitra and 11 to Matthias.
- 49 of the Jira tasks were created by me, 23 by Dimitra and 13 by Matthias.
- We opened 37 pull requests to the UPS: 14 mine and 23 Dimitra's.
- The GSoC branch is 60 commits ahead of master.
- We have 7 pull requests to Kafka CDI: 3 mine and 4 Dimitra's.
TODO List
There are three major topics that we started but that can still be extended or improved. The first is a stress/performance test of UPS now that JMS has been replaced with Kafka, in other words, measuring how the use of Kafka affects throughput. The second is the collection of push metrics: we did some initial tasks, but we believe that more is yet to come. The last one is Kafka security. The research was done and concrete Jira tasks were created. Although it was not our highest priority, the GSoC POC cannot be merged to master and used until security is implemented.
Jira epics for these topics can be found here:
- AGPUSH-2157 Kafka performance metrics
- AGPUSH-2111 Push messages metrics
- AGPUSH-2109 Kafka Security
Future Work
Initially, when I applied for GSoC, integration of HBase was part of the plan. Unfortunately, we did not have the time to do it. HBase is the Hadoop database, a distributed, scalable, big data store. It provides linear and modular scalability; strictly consistent reads and writes (not "eventually consistent"); automatic and configurable sharding of tables; automatic failover support between RegionServers; and an easy-to-use Java API for client access. HBase could improve read/write access in the UPS, so we will leave it as an idea for GSoC AeroGear UnifiedPush Server 2018.
Useful Links
- GSoC 2017 Branch and commits
- All UPS PRs and Kafka CDI PRs by me
- GSoC 2017 Jira board
- All mailing list updates: 1, 2, 3, 4, 5, 6, 7, 8 and 9
- All my blog posts 1, 2, 3 and 4
- All Jira Tasks created by me
- All Jira Tasks assigned to me
Share your feedback and stay tuned for my next post.
Do not overthink, be happy and just keep smiling…
Polina