
- 6th Jul 2024
- 19:29
To demonstrate the ability to apply different types of data visualisations to the selected data and be able to adapt these visualisations according to selected data. The visualisations and techniques used are expected to show an understanding of the nature of the data being analysed. Variety is also important as it demonstrates an ability to keep and focus the viewer’s attention.
- In respect of temporal data, simple time/date/Time Zone (TZ) information is sufficient. For comparing trends, it may be necessary to convert this data to a Coordinated Universal Time (UTC) value. Information on this conversion can be found on multiple sites via Google or other search engines.
- In respect of the spatial data, this can be kept at country/state level. The temporal TZ value can be used to approximate the longitude of the data source and give some indication of location. This latter method should only be used if country/state information is missing.
- Reliability scoring is an evolving subject, but there are “tools” in play that can assess the credibility of a feed. If such a score is available and usable, then a colour progression with a traffic light (red/yellow/green) visualisation could help bolster any claims made based on the data content.
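As a rough illustration of the three points above, the sketch below converts an offset-aware timestamp to UTC, estimates longitude from the UTC offset (15 degrees per hour, a coarse approximation that ignores political time zone boundaries), and maps a reliability score to a traffic-light colour. The function names, the 0-to-1 score range, and the 0.4/0.7 thresholds are assumptions made for this example and are not part of the brief.

```python
from datetime import datetime, timezone


def to_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp that carries a UTC offset and convert it to UTC."""
    local = datetime.fromisoformat(timestamp)
    if local.utcoffset() is None:
        raise ValueError("timestamp has no time zone offset")
    return local.astimezone(timezone.utc)


def approx_longitude(timestamp: str) -> float:
    """Estimate longitude in degrees east from the UTC offset (15 degrees per hour)."""
    offset = datetime.fromisoformat(timestamp).utcoffset()
    if offset is None:
        raise ValueError("timestamp has no time zone offset")
    return offset.total_seconds() / 3600 * 15


def traffic_light(score: float, low: float = 0.4, high: float = 0.7) -> str:
    """Map a reliability score in [0, 1] to red/yellow/green (thresholds are assumed)."""
    if score < low:
        return "red"
    if score < high:
        return "yellow"
    return "green"


stamp = "2022-04-22T11:39:13+05:30"
print(to_utc(stamp).isoformat())   # 2022-04-22T06:09:13+00:00
print(approx_longitude(stamp))     # 82.5
print(traffic_light(0.55))         # yellow
```

The longitude estimate should only be treated as a fallback, in line with the brief: a single UTC offset can span many countries, so it gives a broad band rather than a location.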
Sentiment and frequency of opinions should also be considered in the visualisations. Word clouds can be used to represent the frequency of sentiment/opinion/phrases, while bar charts, radar plots (spider graphs), and sunbursts/treemaps (if a hierarchy can be applied to the data) can also be used.
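A minimal sketch of the counting step that sits behind a word cloud or a bar chart, using only the standard library. The sample records and the helper name `ngram_counts` are hypothetical; the resulting frequency tables are the kind of input a bar chart or a word cloud library would typically take.

```python
from collections import Counter

# Hypothetical sample: (sentiment label, cleaned tokens) per post
posts = [
    ("Negative", ["vaccine", "mandate", "controversy"]),
    ("Negative", ["vaccine", "mandate", "refused"]),
    ("Positive", ["favorite", "position", "vaccine"]),
]


def ngram_counts(token_lists, n=1):
    """Count n-grams (as space-joined strings) across a list of token lists."""
    counts = Counter()
    for tokens in token_lists:
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Frequency of each sentiment label (suits a bar chart)
sentiment_counts = Counter(label for label, _ in posts)
# Frequency of single words and of bigrams (suits a word cloud)
word_counts = ngram_counts([tokens for _, tokens in posts])
bigram_counts = ngram_counts([tokens for _, tokens in posts], n=2)

print(sentiment_counts.most_common())   # [('Negative', 2), ('Positive', 1)]
print(word_counts.most_common(2))       # [('vaccine', 3), ('mandate', 2)]
print(bigram_counts.most_common(1))     # [('vaccine mandate', 2)]
```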
Outputs: The CMA is intended to integrate the learning from the AMLNN and DV modules.
The output from this CMA from the DV perspective is a dashboard built in D3.js, Highcharts, or (if so desired) Python. The dashboard should be laid out so the viewer is not overwhelmed but drawn into the analysis presented. As with UTC, there are many websites that show various dashboard designs, and it is advisable to visit these sites to see what is currently considered best practice.
Data Visualisation - Get Assignment Solution
Please note that this is a sample solution created by our Python programmers for the Data Visualisation assignment. These solutions are for research and reference only.
- Check out the partial solution for this assignment in the blog post below.
Free Assignment Solution - Data Visualisation Common Module
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"id": "SJ8JR8hKWws3",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "da67dd38-a41c-4dcb-e7c7-f226ddce95e9"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"[nltk_data] Downloading package stopwords to /root/nltk_data...\n",
"[nltk_data] Package stopwords is already up-to-date!\n",
"[nltk_data] Downloading package punkt to /root/nltk_data...\n",
"[nltk_data] Package punkt is already up-to-date!\n"
]
}
],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"import seaborn as sns;sns.set(style=\"white\")\n",
"import matplotlib.pyplot as plt\n",
"%matplotlib inline\n",
"\n",
"import re\n",
"from nltk.corpus import stopwords\n",
"import string\n",
"import nltk\n",
"nltk.download('stopwords')\n",
"nltk.download('punkt')\n",
"\n",
"import warnings\n",
"warnings.simplefilter(\"ignore\")"
],
"id": "SJ8JR8hKWws3"
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"id": "2A-_C1jGmiAV",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 354
},
"outputId": "658ba911-9782-4181-d832-2edb89b139ad"
},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" DateTime \\\n",
"0 2022-04-22 06:09:13+00:00 \n",
"1 2022-04-22 04:44:32+00:00 \n",
"2 2022-04-22 04:37:54+00:00 \n",
"3 2022-04-22 03:25:55+00:00 \n",
"4 2022-04-22 03:21:10+00:00 \n",
"\n",
" Text Followers \\\n",
"0 Djokovic was not my N1 favorite, but after his... 52 \n",
"1 @Surtilala24 Djokovic's issue can still be und... 7353 \n",
"2 @siddtalks Already they blew up on Djokovic va... 270 \n",
"3 Vaccine mandates & indefensible war are ch... 45 \n",
"4 @OmarssAlejandro @anna12345marko @StuYork13 @M... 7268 \n",
"\n",
" Retweet Count Likes Location Sentiment \\\n",
"0 1 1 Venezuela 1 \n",
"1 0 0 India -1 \n",
"2 0 1 India -1 \n",
"3 0 0 Canada -1 \n",
"4 0 0 USA -1 \n",
"\n",
" Cleaned_Text \n",
"0 ['n1', 'favorite', 'position', 'regarding', 'c... \n",
"1 ['issue', 'still', 'understood', 'refused', 't... \n",
"2 ['already', 'blew', 'controversy', 'allowing',... \n",
"3 ['mandate', 'indefensible', 'war', 'chalk', 'c... \n",
"4 ['nobody', 'disputing', 'reason', 'acting', '“... "
],
"text/html": [
"\n",
" <div id=\"df-c67edc5f-99f6-4d76-83cb-9faa276156ff\">\n",
" <div class=\"colab-df-container\">\n",
" <div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>DateTime</th>\n",
" <th>Text</th>\n",
" <th>Followers</th>\n",
" <th>Retweet Count</th>\n",
" <th>Likes</th>\n",
" <th>Location</th>\n",
" <th>Sentiment</th>\n",
" <th>Cleaned_Text</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2022-04-22 06:09:13+00:00</td>\n",
" <td>Djokovic was not my N1 favorite, but after his...</td>\n",
" <td>52</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>Venezuela</td>\n",
" <td>1</td>\n",
" <td>['n1', 'favorite', 'position', 'regarding', 'c...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2022-04-22 04:44:32+00:00</td>\n",
" <td>@Surtilala24 Djokovic's issue can still be und...</td>\n",
" <td>7353</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>India</td>\n",
" <td>-1</td>\n",
" <td>['issue', 'still', 'understood', 'refused', 't...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2022-04-22 04:37:54+00:00</td>\n",
" <td>@siddtalks Already they blew up on Djokovic va...</td>\n",
" <td>270</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>India</td>\n",
" <td>-1</td>\n",
" <td>['already', 'blew', 'controversy', 'allowing',...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>2022-04-22 03:25:55+00:00</td>\n",
" <td>Vaccine mandates &amp; indefensible war are ch...</td>\n",
" <td>45</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>Canada</td>\n",
" <td>-1</td>\n",
" <td>['mandate', 'indefensible', 'war', 'chalk', 'c...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>2022-04-22 03:21:10+00:00</td>\n",
" <td>@OmarssAlejandro @anna12345marko @StuYork13 @M...</td>\n",
" <td>7268</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>USA</td>\n",
" <td>-1</td>\n",
" <td>['nobody', 'disputing', 'reason', 'acting', '“...</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>\n",
" <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-c67edc5f-99f6-4d76-83cb-9faa276156ff')\"\n",
" title=\"Convert this dataframe to an interactive table.\"\n",
" style=\"display:none;\">\n",
" \n",
" <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
" width=\"24px\">\n",
" <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
" <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
" </svg>\n",
" </button>\n",
" \n",
" <style>\n",
" .colab-df-container {\n",
" display:flex;\n",
" flex-wrap:wrap;\n",
" gap: 12px;\n",
" }\n",
"\n",
" .colab-df-convert {\n",
" background-color: #E8F0FE;\n",
" border: none;\n",
" border-radius: 50%;\n",
" cursor: pointer;\n",
" display: none;\n",
" fill: #1967D2;\n",
" height: 32px;\n",
" padding: 0 0 0 0;\n",
" width: 32px;\n",
" }\n",
"\n",
" .colab-df-convert:hover {\n",
" background-color: #E2EBFA;\n",
" box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
" fill: #174EA6;\n",
" }\n",
"\n",
" [theme=dark] .colab-df-convert {\n",
" background-color: #3B4455;\n",
" fill: #D2E3FC;\n",
" }\n",
"\n",
" [theme=dark] .colab-df-convert:hover {\n",
" background-color: #434B5C;\n",
" box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
" filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
" fill: #FFFFFF;\n",
" }\n",
" </style>\n",
"\n",
" <script>\n",
" const buttonEl =\n",
" document.querySelector('#df-c67edc5f-99f6-4d76-83cb-9faa276156ff button.colab-df-convert');\n",
" buttonEl.style.display =\n",
" google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
"\n",
" async function convertToInteractive(key) {\n",
" const element = document.querySelector('#df-c67edc5f-99f6-4d76-83cb-9faa276156ff');\n",
" const dataTable =\n",
" await google.colab.kernel.invokeFunction('convertToInteractive',\n",
" [key], {});\n",
" if (!dataTable) return;\n",
"\n",
" const docLinkHtml = 'Like what you see? Visit the ' +\n",
" '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
" + ' to learn more about interactive tables.';\n",
" element.innerHTML = '';\n",
" dataTable['output_type'] = 'display_data';\n",
" await google.colab.output.renderOutput(dataTable, element);\n",
" const docLink = document.createElement('div');\n",
" docLink.innerHTML = docLinkHtml;\n",
" element.appendChild(docLink);\n",
" }\n",
" </script>\n",
" </div>\n",
" </div>\n",
" "
]
},
"metadata": {},
"execution_count": 2
}
],
"source": [
"data = pd.read_csv(\"/content/CleanedTweets.csv\")\n",
"data = data.drop(\"Unnamed: 0\", axis=1)\n",
"data.head()"
],
"id": "2A-_C1jGmiAV"
},
{
"cell_type": "markdown",
"source": [
"> Sentiment Distribution:"
],
"metadata": {
"id": "ku3HTDeMeOyn"
},
"id": "ku3HTDeMeOyn"
},
{
"cell_type": "code",
"source": [
"data[\"Sentiment\"]= data[\"Sentiment\"].replace({-1:\"Negative\", 1:\"Positive\", 0:\"Neutral\"})\n",
"plt.figure(figsize=(15,4))\n",
"sns.countplot(y = 'Sentiment' , data = data, palette=\"Set1\")\n",
"plt.title('Sentiment Ratio')"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 318
},
"id": "o1tjQdU6eOgl",
"outputId": "a3048fa7-4a4a-4451-e6b7-9154f36ec88d"
},
"id": "o1tjQdU6eOgl",
"execution_count": 3,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"Text(0.5, 1.0, 'Sentiment Ratio')"
]
},
"metadata": {},
"execution_count": 3
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"<Figure size 1080x288 with 1 Axes>"
]
}
}
]
},
{
"cell_type": "code",
"source": [
"# Character length of each tweet (vectorised instead of a Python loop)\n",
"data[\"Text\"] = data[\"Text\"].astype(\"str\")\n",
"data[\"Length\"] = data[\"Text\"].str.len()\n",
"\n",
"# Box plot on top, histogram with KDE below, sharing the x-axis\n",
"f, (ax_box, ax_hist) = plt.subplots(2, sharex=True, gridspec_kw={\"height_ratios\": (.2, .80)})\n",
"f.set_size_inches(15, 5)\n",
"# Pass the column as x= so the box is drawn horizontally on the shared axis\n",
"sns.boxplot(x=data[\"Length\"], ax=ax_box)\n",
"# histplot replaces the deprecated distplot (needs seaborn >= 0.11)\n",
"sns.histplot(data[\"Length\"], kde=True, stat=\"density\", ax=ax_hist)\n",
"ax_box.set(xlabel='')\n",
"ax_box.set_title(\"Length Distribution of Tweets\")"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 346
},
"id": "Jof97cxZj3nI",
"outputId": "dfffd7c5-fb05-46dc-df4c-859624d37101"
},
"id": "Jof97cxZj3nI",
"execution_count": 6,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"Text(0.5, 1.0, 'Length Distribution of Tweets')"
]
},
"metadata": {},
"execution_count": 6
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"<Figure size 1080x360 with 2 Axes>"
]
}
}
]
},
{
"cell_type": "code",
"source": [
"from sklearn.feature_extraction.text import CountVectorizer\n",
"from wordcloud import WordCloud, STOPWORDS\n",
"\n",
"# Named wc_stopwords so it does not shadow the nltk.corpus.stopwords import above\n",
"wc_stopwords = set(STOPWORDS)\n",
"\n",
"# Count every bigram across the cleaned tweets\n",
"vectorizer = CountVectorizer(ngram_range=(2, 2))\n",
"bag_of_words = vectorizer.fit_transform(data[\"Cleaned_Text\"])\n",
"sum_words = bag_of_words.sum(axis=0)\n",
"words_freq = [(word, sum_words[0, idx]) for word, idx in vectorizer.vocabulary_.items()]\n",
"words_freq = sorted(words_freq, key=lambda x: x[1], reverse=True)\n",
"\n",
"bigram_words_dict = dict(words_freq)\n",
"\n",
"# Draw the cloud directly from the bigram counts\n",
"wordCloud = WordCloud(stopwords=wc_stopwords,\n",
"                      background_color='white',\n",
"                      width=800,\n",
"                      height=800).generate_from_frequencies(bigram_words_dict)\n",
"\n",
"plt.figure(figsize=(8, 8))\n",
"plt.imshow(wordCloud, interpolation='bilinear')\n",
"plt.axis(\"off\")\n",
"plt.tight_layout(pad=0)\n",
"plt.show()"
],
"id": "bOUdr2w9IkoB"
},
{
"cell_type": "markdown",
"source": [
"> Trigrams Wordcloud:"
],
"metadata": {
"id": "tlaCM0w1tdfw"
},
"id": "tlaCM0w1tdfw"
},
{
"cell_type": "code",
"source": [
"# Named wc_stopwords so it does not shadow the nltk.corpus.stopwords import\n",
"wc_stopwords = set(STOPWORDS)\n",
"\n",
"# Count every trigram across the cleaned tweets\n",
"vectorizer = CountVectorizer(ngram_range=(3, 3))\n",
"bag_of_words = vectorizer.fit_transform(data[\"Cleaned_Text\"])\n",
"sum_words = bag_of_words.sum(axis=0)\n",
"words_freq = [(word, sum_words[0, idx]) for word, idx in vectorizer.vocabulary_.items()]\n",
"words_freq = sorted(words_freq, key=lambda x: x[1], reverse=True)\n",
"\n",
"trigram_words_dict = dict(words_freq)\n",
"\n",
"# Draw the cloud directly from the trigram counts\n",
"wordCloud = WordCloud(stopwords=wc_stopwords,\n",
"                      background_color='white',\n",
"                      width=800,\n",
"                      height=800).generate_from_frequencies(trigram_words_dict)\n",
"\n",
"plt.figure(figsize=(8, 8))\n",
"plt.imshow(wordCloud, interpolation='bilinear')\n",
"plt.axis(\"off\")\n",
"plt.tight_layout(pad=0)\n",
"plt.show()\n",
"# Save the rendered cloud to disk\n",
"wordCloud.to_file('overall_wordcloud_trigram.jpg')"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 624
},
"id": "Vq4xeE0xtcQT",
"outputId": "2041eb4b-5384-4040-9689-522ccea8774b"
},
"id": "Vq4xeE0xtcQT",
"execution_count": 12,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"<Figure size 576x576 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
}
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"<wordcloud.wordcloud.WordCloud at 0x7f6627446610>"
]
},
"metadata": {},
"execution_count": 12
}
]
},
{
"cell_type": "code",
"source": [
""
],
"metadata": {
"id": "uNANPXcLt--G"
},
"id": "uNANPXcLt--G",
"execution_count": 12,
"outputs": []
}
],
"metadata": {
"colab": {
"collapsed_sections": [],
"name": "visual.ipynb",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
About The Author - Jane Doe
Jane Doe is an expert in data analysis and visualization, with extensive experience in creating insightful, user-friendly dashboards. She has a strong background in temporal and spatial data management, making her adept at handling complex datasets and converting them into meaningful visual narratives.