This post presents the decisions we made when choosing the technologies for a scalable web application project.
Abstract description of the Project
The main requirement was to create a project management application with collaboration (“Facebook-like”) features.
Requirements for the project:
-
Web access to the tool for everyone (not an internal application), but it should also be deployable as a standalone version for internal purposes (intranet usage).
-
A large number of users is expected (about 10 000 – 100 000 in the first few months) with moderate traffic (only a few users produce data; most will only view the stored data).
-
Unpredictable amounts of various data for each user.
-
Web pages with lots of dynamic content (e.g. complex graphical components such as graphs and charts)
-
Expandable – there is a high probability that the requirements will change over time
-
Lots of customization options per group and per user
These requirements led to the following choices:
GWT
Given the large amount of dynamic content, it was clear from the beginning that the pages would contain a lot of JavaScript (JS) code. We also expected frequent changes to both the UI/front-end (FE) and the back-end (BE), and we knew that maintaining a large pure-JS code base would be painful, so we decided to use a framework and chose GWT.
We decided to go for “pure” GWT because we did not want to be tied to any specific GWT library or wrapper. According to user reviews and feedback, extending the components and functionality provided by third-party GWT libraries tends to cause a lot of issues.
For those who do not know it, GWT is a Java-based framework. Its strength lies in its powerful compiler, which transforms Java code into JavaScript (http://gwt.google.com). GWT is used in major Google products such as Google AdWords, Google Wave, Google Groups and Orkut.
Pros
-
No need to pre-download anything special; it works like any other web page
-
A GWT application can handle a large number of concurrent users
-
You can create highly responsive web applications with heavy lifting on the client-side and reduced communication with the server-side
-
There is minimal traffic between the client and the server. The GWT client (UI) is loaded only once; after that, only data objects are sent between the backend and the frontend.
-
The initially loaded GWT client can be split into parts that are loaded on demand
-
The GWT compiler optimizes the generated code, removes dead code and even obfuscates the JavaScript for you all in one shot.
-
The GWT compiler generates cross-browser JavaScript code.
-
Backed by Google, so continued support can be expected
Cons
-
Users need to allow JavaScript (JS) execution in their browser.
-
Implementing complex, good-looking components takes more time.
-
Using third-party JS libraries is a little tricky: you need to write a custom wrapper.
-
Slow GWT compilation, which slows down the development process
-
Security: a client-side RIA is not secure (you can’t trust data coming from the browser), so you have to implement an additional server-side security layer that verifies the incoming (potentially modified) traffic.
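To illustrate the last point, here is a minimal plain-Java sketch of a server-side check. It is not taken from the project: the names RenameTaskRequest and TaskGuard, the in-memory owner map and the 100-character title limit are all assumptions made for the example. The idea is only that the server takes the user from its own session and re-checks every field, whatever the client has already validated.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical request as it arrives from the browser; any field may have been tampered with.
final class RenameTaskRequest {
    final String taskId;
    final String newTitle;

    RenameTaskRequest(String taskId, String newTitle) {
        this.taskId = taskId;
        this.newTitle = newTitle;
    }
}

// Hypothetical server-side guard that re-checks everything instead of trusting the client.
final class TaskGuard {
    // taskId -> owner; stands in for a real database lookup (assumption).
    private final Map<String, String> ownerByTask = new HashMap<>();

    void registerTask(String taskId, String owner) {
        ownerByTask.put(taskId, owner);
    }

    // sessionUser must come from the server-side session, never from the request payload.
    boolean isAllowed(String sessionUser, RenameTaskRequest request) {
        if (sessionUser == null || request == null) {
            return false;
        }
        String title = request.newTitle;
        // Repeat the input checks on the server: the client-side checks may have been bypassed.
        if (title == null || title.trim().isEmpty() || title.length() > 100) {
            return false;
        }
        // Unknown tasks have no owner, so the comparison fails and the request is rejected.
        return sessionUser.equals(ownerByTask.get(request.taskId));
    }
}
```

A service method would call isAllowed before touching the database and reject the request otherwise.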
GWT – RPC vs. RequestFactory
We chose RPC because RequestFactory is usually better suited to CRUD operations on entity proxies, whereas we use custom models to minimize traffic between the client side and the backend.
(http://stackoverflow.com/questions/4119867/when-should-i-use-requestfactory-vs-gwt-rpc)
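The “custom models” idea can be sketched in plain Java as follows. The classes ProjectEntity and ProjectSummary are hypothetical and only show the shape of the approach: the server keeps the full entity and sends the client a slim, serializable model containing just what the view needs. GWT RPC expects transferred types to be serializable and to have a no-argument constructor, which the sketch respects; everything else is an assumption.

```java
import java.io.Serializable;
import java.util.List;

// Hypothetical full server-side entity; it holds more data than a list view needs.
final class ProjectEntity {
    final String id;
    final String name;
    final String description;
    final List<String> memberNames;

    ProjectEntity(String id, String name, String description, List<String> memberNames) {
        this.id = id;
        this.name = name;
        this.description = description;
        this.memberNames = memberNames;
    }
}

// Hypothetical slim model sent over the wire instead of the whole entity.
final class ProjectSummary implements Serializable {
    private static final long serialVersionUID = 1L;

    private String id;
    private String name;
    private int memberCount;

    // No-argument constructor, needed for RPC serialization.
    ProjectSummary() {
    }

    ProjectSummary(String id, String name, int memberCount) {
        this.id = id;
        this.name = name;
        this.memberCount = memberCount;
    }

    String getId() { return id; }
    String getName() { return name; }
    int getMemberCount() { return memberCount; }

    // Copies only the fields the client view needs; description and member names stay on the server.
    static ProjectSummary from(ProjectEntity entity) {
        return new ProjectSummary(entity.id, entity.name, entity.memberNames.size());
    }
}
```

The trade-off is an extra mapping step on the server in exchange for smaller payloads and a client model that does not depend on persistence classes.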
GWT bean validations
http://code.google.com/p/gwt-validation
To enable and perform validation on entities, we use gwt-validation, a fairly basic library that supports standard JSR 303 validation (@NotNull, @Size(min=3, max=5), etc.). GWT also has built-in validation support, but we wanted to use the same validation on the FE and the BE, and this library handles both sides with the same implementation and without much boilerplate code or configuration.
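The sketch below shows roughly what annotation-driven validation of an entity looks like. To stay self-contained it declares its own tiny @NotNull and @Size annotations and a reflective checker called MiniValidator. These are stand-ins written for this example, not the javax.validation or gwt-validation API, and a reflective checker like this one would only run on a server-side JVM. The TaskBean class is hypothetical as well.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Local stand-ins that mimic the look of the JSR 303 annotations (NOT the real ones).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface NotNull {
}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Size {
    int min() default 0;
    int max() default Integer.MAX_VALUE;
}

// Hypothetical entity; the constraints are declared once, on the class itself.
final class TaskBean {
    @NotNull
    @Size(min = 3, max = 5)
    String code;

    TaskBean(String code) {
        this.code = code;
    }
}

// Hypothetical reflective checker standing in for a real validator implementation.
final class MiniValidator {
    static List<String> validate(Object bean) {
        List<String> violations = new ArrayList<>();
        for (Field field : bean.getClass().getDeclaredFields()) {
            field.setAccessible(true);
            Object value;
            try {
                value = field.get(bean);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            if (field.isAnnotationPresent(NotNull.class) && value == null) {
                violations.add(field.getName() + " must not be null");
            }
            // As in JSR 303, a null value is left to @NotNull and is not a @Size violation.
            Size size = field.getAnnotation(Size.class);
            if (size != null && value instanceof String) {
                int length = ((String) value).length();
                if (length < size.min() || length > size.max()) {
                    violations.add(field.getName() + " length must be between "
                            + size.min() + " and " + size.max());
                }
            }
        }
        return violations;
    }
}
```

Because the rules live on the entity class, the same class can be checked before a request is sent and again when it arrives on the server.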
Model View Presenter
We considered several software architecture patterns and decided to use MVP. One of the main reasons was that Google also encourages using the MVP pattern in combination with GWT (https://developers.google.com/web-toolkit/articles/mvp-architecture). MVP is a user interface design pattern engineered to facilitate automated unit testing and improve the separation of concerns in presentation logic.
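The testing benefit is easiest to see in a small example. The following plain-Java sketch is not the project's code: TaskListView, TaskService, TaskListPresenter and RecordingView are hypothetical names, and the service call is synchronous for brevity, whereas a real GWT RPC call would be asynchronous. The presenter depends only on interfaces, so a unit test can drive it with a fake view and no browser.

```java
import java.util.ArrayList;
import java.util.List;

// View contract: the presenter only knows this interface, not any widget class.
interface TaskListView {
    void showTasks(List<String> titles);
    void showError(String message);
}

// Hypothetical data source; in a GWT application this would be an asynchronous RPC service.
interface TaskService {
    List<String> loadTitles(String projectId);
}

// Presenter: holds the presentation logic and can be tested without a UI.
final class TaskListPresenter {
    private final TaskListView view;
    private final TaskService service;

    TaskListPresenter(TaskListView view, TaskService service) {
        this.view = view;
        this.service = service;
    }

    void onProjectSelected(String projectId) {
        if (projectId == null || projectId.isEmpty()) {
            view.showError("No project selected");
            return;
        }
        view.showTasks(service.loadTitles(projectId));
    }
}

// Fake view that records what the presenter asked it to display; used in tests only.
final class RecordingView implements TaskListView {
    final List<String> shown = new ArrayList<>();
    String error;

    @Override
    public void showTasks(List<String> titles) {
        shown.clear();
        shown.addAll(titles);
    }

    @Override
    public void showError(String message) {
        error = message;
    }
}
```

In production the view interface would be implemented by a widget-based class, while tests pass in RecordingView and assert on its fields.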
Application server
The obvious choice would be to use GWT with GAE (http://developers.google.com/appengine/), but the requirements call for the application to be deployable as a standalone version too. GAE is a Google solution that is tightly bound to Google’s online services and thus not easily portable to a local machine. As a result, we decided to go with OpenShift (OS).
OpenShift
OS is Red Hat’s open source platform as a service (PaaS), an alternative to other cloud solutions (GAE, Amazon EC2…). OS scales the deployed web application and database tiers so you don’t have to. OS supports both JBoss AS and JBoss EAP, so we can use JBoss as the application server for standalone installations as well. Additionally, OS provides its services free of charge for development purposes.
So far OS does not provide custom SSL certificates, but there are plans to include SNI-based SSL at the MegaShift tier (https://openshift.redhat.com/community/faq/how-do-i-get-ssl-for-my-domains).
Database
MongoDB
We decided to use a NoSQL database, MongoDB, instead of a classic RDBMS (e.g. Oracle, MS SQL…) because of its horizontal scalability (http://en.wikipedia.org/wiki/Scalability#Scale_horizontally_vs._vertically). OS supports MongoDB out of the box, which is another reason why we chose OpenShift.
Note that MongoDB has no support for transactions and does not enforce a strict data structure.
Morphia
http://code.google.com/p/morphia
MongoDB (like most other NoSQL DBs) is schemaless by default: it does not define an exact data structure. This can be a strong feature, but it also raises problems in enterprise development (e.g. how to map Java objects to a schemaless DB, how to keep the data consistent). We therefore chose the Morphia library (http://code.google.com/p/morphia/), which provides an abstraction layer between the non-relational MongoDB and the object world of Java.
The disadvantage of this library is that there has not been any official Morphia release since 2011. We are aware of this, and for our purposes the old version is sufficient. There are also rumors that the Morphia functionality will soon be integrated into the MongoDB driver itself, and a fork of the original Morphia library is still kept alive (https://github.com/jmkgreen/morphia). There were some issues integrating Mongo/Morphia into the GWT solution, mostly caused by the non-standard ObjectId used by the entities in combination with GWT’s partially limited Java support, but after these were solved it all works nicely together.
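The post does not say how the ObjectId issues were resolved. One common workaround, shown here purely as an assumption, is to keep the driver's id type on the server and give client-side models the id as a plain String, using the usual 24-character hex form of the 12-byte id. The helper IdHex below is hypothetical and works on raw bytes so that it needs no MongoDB classes.

```java
// Hypothetical helper: converts a 12-byte MongoDB-style id to and from its
// 24-character hex form, so client-side models can carry the id as a String.
final class IdHex {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    static String toHex(byte[] id) {
        if (id == null || id.length != 12) {
            throw new IllegalArgumentException("expected 12 bytes");
        }
        char[] out = new char[24];
        for (int i = 0; i < 12; i++) {
            int b = id[i] & 0xff;            // treat the byte as unsigned
            out[2 * i] = DIGITS[b >>> 4];    // high nibble
            out[2 * i + 1] = DIGITS[b & 0x0f]; // low nibble
        }
        return new String(out);
    }

    static byte[] fromHex(String hex) {
        if (hex == null || hex.length() != 24) {
            throw new IllegalArgumentException("expected 24 hex characters");
        }
        byte[] id = new byte[12];
        for (int i = 0; i < 12; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex string");
            }
            id[i] = (byte) ((hi << 4) | lo);
        }
        return id;
    }
}
```

With a conversion like this at the server boundary, the shared model classes only use types that GWT can compile.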
Note: we also experimented with JPA/JDO persistence providers (DataNucleus, EclipseLink…), but these either do not support NoSQL functions at all or support them only in a VERY limited way. The whole NoSQL approach is completely different from the standard RDBMS approach, so handling SQL and NoSQL DBs the same way without sacrificing functionality is practically impossible. One day there may be some kind of common API to handle NoSQL DBs, but that is doubtful, since even the different NoSQL DB types (key-value stores, document stores, graph DBs) vary a great deal.
Development environment
Eclipse
Eclipse was a clear choice because GWT provides an Eclipse integration plugin. This plugin also contains a UiBinder add-on, which enables declarative XML-based UI definition and also supports a basic drag-and-drop component layout builder.
TeamCity
http://www.jetbrains.com/teamcity/
TeamCity is a user-friendly continuous integration (CI) server for developers and build engineers. We use it for “pretested” commits: when a developer wants to push code to the SVN repository, the unit tests and integration tests are executed against the project code and a build is made. Only after these jobs complete successfully are the changes written to the repository. All of this runs on the TeamCity server, so developers do not have to run these jobs on their own machines. A history log is kept so that authorized users can view all actions and test results. We use the TeamCity plugin for Eclipse to integrate it with our development environment.
Conclusion
After looking at various solutions and also trying other frameworks (see our other article about Vaadin, Building Scalable Cloud App Using GWT, JBoss, MongoDB on OpenShift Part 2 (Vaadin tests)), we decided that this architecture will serve our purposes best, in spite of its limitations (the main risk being the time needed to develop good-looking UI components).
Thanks to Ladislav Huszar and Peter Bozik for putting this post together!