OTC and the Analytics Framework Social Media Analytics T-45 days GTRI - Proprietary.
![Page 1: OTC and the Analytics Framework Social Media Analytics T-45 days GTRI - Proprietary.](https://reader036.fdocuments.net/reader036/viewer/2022072008/56649d755503460f94a56341/html5/thumbnails/1.jpg)
OTC and the Analytics Framework
Social Media Analytics T-45 days
GTRI - Proprietary
![Page 2: OTC and the Analytics Framework Social Media Analytics T-45 days GTRI - Proprietary.](https://reader036.fdocuments.net/reader036/viewer/2022072008/56649d755503460f94a56341/html5/thumbnails/2.jpg)
Yay, an IRAD that helps us do our jobs
DataLoader API
TwitterOTC
TwitterFollowers
Analytics API
HierNMF
Others?
Visualization API
Happy Hour
FB Feelings
Reach
Timeline Deliveries
Platform
Tweets/min
Other Hashtags
Top Links
And so on…
Workflow: Hourly topic models
Supporting architecture and data management
![Page 3: OTC and the Analytics Framework Social Media Analytics T-45 days GTRI - Proprietary.](https://reader036.fdocuments.net/reader036/viewer/2022072008/56649d755503460f94a56341/html5/thumbnails/3.jpg)
So here’s what we need to build
DataLoader API
TwitterOTC
TwitterFollowers
Analytics API
HierNMF
Others?
Visualization API
Happy Hour
FB Feelings
Reach
Timeline Deliveries
Platform
Tweets/min
Other Hashtags
Top Links
And so on…
Workflow: Hourly topic models
Supporting architecture and data management
![Page 4: OTC and the Analytics Framework Social Media Analytics T-45 days GTRI - Proprietary.](https://reader036.fdocuments.net/reader036/viewer/2022072008/56649d755503460f94a56341/html5/thumbnails/4.jpg)
Let’s use our words
Ingest Modules
• TwitterOTC
– Define where to store tweets in MongoDB
– Define where to store counts in MySQL
– All specific counts and transformations (sentiment and dates) are hardcoded into the ingest module (these could later become filters)
• TwitterFollowers
– Define where to store followers in MySQL
– GET the user's followers from the Twitter API
– For each hour, store counts of follow and unfollow actions
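One way the hourly follow and unfollow counts could be derived is by diffing successive follower snapshots. The sketch below is an assumption, not the TwitterFollowers module itself: `hourly_follow_counts` and the snapshot format are hypothetical, and the Twitter API call and MySQL write are left out.

```python
from datetime import datetime


def hourly_follow_counts(snapshots):
    """Count follow/unfollow actions per hour.

    snapshots: list of (timestamp, set_of_follower_ids), sorted by time
    (hypothetical format). Returns {hour_start: {"follow": n, "unfollow": m}}.
    """
    counts = {}
    # Compare each snapshot with the one before it.
    for (_, prev), (ts, cur) in zip(snapshots, snapshots[1:]):
        hour = ts.replace(minute=0, second=0, microsecond=0)
        bucket = counts.setdefault(hour, {"follow": 0, "unfollow": 0})
        bucket["follow"] += len(cur - prev)    # ids that appeared
        bucket["unfollow"] += len(prev - cur)  # ids that disappeared
    return counts
```

Changes are attributed to the hour of the later snapshot, so the polling interval bounds how precisely an action can be placed in time.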
Workflow Prototype
• Hourly topic models
– Every hour, create a matrix for the previous hour from the TwitterOTC source
– Once a new matrix shows up, HierNMF clustering will kick off to create new topic models
– Once new models show up, the MySQL variable storing the most recent model path will be updated
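The first step of this workflow might look roughly like the sketch below, which builds a tweet-by-term count matrix for the previous clock hour. The function names, the (timestamp, text) tweet format, and the whitespace tokenizer are all assumptions; the matrix layout HierNMF actually expects is not given here, and the clustering and MySQL update steps are omitted.

```python
from collections import Counter
from datetime import datetime, timedelta


def previous_hour_window(now):
    """Return (start, end) of the last complete clock hour before `now`."""
    end = now.replace(minute=0, second=0, microsecond=0)
    return end - timedelta(hours=1), end


def build_term_matrix(tweets, now):
    """tweets: iterable of (timestamp, text) pairs (hypothetical format).

    Returns (vocab, matrix), one row per tweet and one column per term.
    """
    start, end = previous_hour_window(now)
    # Naive tokenization: lowercase and split on whitespace.
    docs = [Counter(text.lower().split())
            for ts, text in tweets if start <= ts < end]
    vocab = sorted(set().union(*docs))
    matrix = [[doc[term] for term in vocab] for doc in docs]
    return vocab, matrix
```

A scheduler would call `build_term_matrix` shortly after each hour boundary and write the result wherever the clustering step watches for new matrices.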
Repositories now have otc_demo branches for our use
![Page 5: OTC and the Analytics Framework Social Media Analytics T-45 days GTRI - Proprietary.](https://reader036.fdocuments.net/reader036/viewer/2022072008/56649d755503460f94a56341/html5/thumbnails/5.jpg)
Creating Visualizations
1. Create the python handler
– Use the visualization interface as a template
• otc_vis/VISUALIZATION_INTERFACE.py
– All visualizations are located here:
• otc_vis/
• For reference: analytics-framework/visualization/python/Visualization/vis/
2. Create the directive, if you are not using the default directive
• src/js/directives/VisDirectives.js
3. Test in a single page with known data
– Use sample page as starting template
• src/widgets/topics.html
– Modify the createVis function to use your inputs and send the request to your newly created vis
![Page 6: OTC and the Analytics Framework Social Media Analytics T-45 days GTRI - Proprietary.](https://reader036.fdocuments.net/reader036/viewer/2022072008/56649d755503460f94a56341/html5/thumbnails/6.jpg)
Creating Ingest Modules
1. Create the python handler
– Place in otc_ingest/
– Use existing TwitterOTC as starting point
– Register with the analytics framework:
python analytics-framework/configure.py --api ingest --mode add --filename otc_ingest.[your_python_file].py
![Page 7: OTC and the Analytics Framework Social Media Analytics T-45 days GTRI - Proprietary.](https://reader036.fdocuments.net/reader036/viewer/2022072008/56649d755503460f94a56341/html5/thumbnails/7.jpg)
Our friend, Vincent Vega
We do this in Vincent:
scatter = (vincent.Scatter(self.matrix, iter_idx=0)
           .colors(brew='Set1')
           .axis_titles(x=self.features[0], y='')
           .legend(title='Features')
           .to_json())
We get this in Vega:
{"title": "Features", "offset": 0, "properties": {}, "fill": "color"}], "scales": [{"range": "width", "domain": {"field": "data.idx", "data": "table"}, "type": "linear", "name": "x", "zero": false}, {"range": "height", "domain": {"field": "data.val", "data": "table"}, "name": "y", "nice": true}, {"range": ["#e41a1c", "#377eb8", "#4daf4a", "#984ea3", "#ff7f00", "#ffff33", "#a65628", "#f781bf", "#999999"], "domain": {"field": "data.col", "data": "table"}, "type": "ordinal", "name": "color"}], "axes": [{"scale": "x", "type": "x", "title": "Petal length"}, {"scale": "y", "type": "y", "title": ""}], "height": 500, "padding": "auto", "width": 960, "marks": [{"type": "group", "from": {"data": "table", "transform": [{"keys": ["data.col"], "type": "facet"}]}, "marks": [{"type": "symbol", "properties": {"enter": {"y": {"field": "data.val", "scale": "y"}, "x": {"field": "data.idx", "scale": "x"}, "size": {"value": 100}, "fill": {"field": "data.col", "scale": "color"}}}}]}], "data": [{"values": [{"val": 1.4, "col": "Petal length", "idx": 0}, {"val": 0.2, "col": "Petal width", "idx": 0}, {"val": 5.1, "col": "Sepal length", "idx": 0}, {"val": 3.5, "col": "Sepal width", "idx": 0}, {"val": 1.4, "col": "Petal length", "idx": 1}, {"val": 0.2, "col": "Petal width", "idx": 1},…
![Page 8: OTC and the Analytics Framework Social Media Analytics T-45 days GTRI - Proprietary.](https://reader036.fdocuments.net/reader036/viewer/2022072008/56649d755503460f94a56341/html5/thumbnails/8.jpg)
Getting to know Vincent Vega
• Various basic, predefined vis (see Scatter.py, Linechart.py…)
– http://vincent.readthedocs.org/en/latest/
• Basic building block is a Chart (see ClusterScatter.py)
– Chart.data is a list of the data sources (each with a unique name), which can be created from basic python types or pandas and numpy types
– Chart.scales is a list of Scale objects, which use a data source and can define x range, y range, and colors
– Chart.axes is a list of Axis objects which identify Scale objects to use for defining the axes
– Chart.marks is a list of Mark objects, each of which defines a particular set of data to display in a particular way
– Chart.legend defines what to call the legend (the content of the legend itself will be drawn from Chart.marks)
• Vega is a higher-level visualization specification language on top of D3
– In general, if you can do it in Vega, you can do it in Vincent via kwargs
– https://github.com/trifacta/vega/wiki/Vega-and-D3
– http://trifacta.github.io/vega/
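To see how the data, scales, axes, and marks pieces fit together, the sketch below assembles a Vega-style scatter spec as a plain Python dict, without Vincent. It mirrors the field names in the Vega output shown earlier, but it is a simplified assumption rather than what Vincent emits: `scatter_spec` is a hypothetical helper, the data source name "table" is inferred from the scale domains, the faceted group mark is flattened into a single symbol mark, and the legend is omitted.

```python
import json

# First three Set1 colors from the "color" scale range in the Vega output.
SET1 = ["#e41a1c", "#377eb8", "#4daf4a"]


def scatter_spec(rows, x_title, width=960, height=500):
    """rows: list of dicts with "idx", "val" and "col" keys."""
    return {
        "width": width,
        "height": height,
        "padding": "auto",
        # data: named sources that scales and marks refer to
        "data": [{"name": "table", "values": rows}],
        # scales: map data fields onto pixel ranges and colors
        "scales": [
            {"name": "x", "type": "linear", "range": "width", "zero": False,
             "domain": {"data": "table", "field": "data.idx"}},
            {"name": "y", "range": "height", "nice": True,
             "domain": {"data": "table", "field": "data.val"}},
            {"name": "color", "type": "ordinal", "range": SET1,
             "domain": {"data": "table", "field": "data.col"}},
        ],
        # axes: each axis names the scale it draws
        "axes": [
            {"type": "x", "scale": "x", "title": x_title},
            {"type": "y", "scale": "y", "title": ""},
        ],
        # marks: how each data row is drawn
        "marks": [{
            "type": "symbol",
            "from": {"data": "table"},
            "properties": {"enter": {
                "x": {"field": "data.idx", "scale": "x"},
                "y": {"field": "data.val", "scale": "y"},
                "size": {"value": 100},
                "fill": {"field": "data.col", "scale": "color"},
            }},
        }],
    }
```

Calling `json.dumps(scatter_spec(rows, "Petal length"))` produces a JSON string that could be handed to the front end; whether it renders as-is depends on the Vega version in use.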