Philly Skyscrapers

When I came across Zen Master Peter Gilks's post on NYC skyscrapers, I decided to do one for Philly. Philadelphia is the largest city in the Commonwealth of Pennsylvania and the sixth most populous city in the United States. Adding the steeple to Christ Church made Philly home to the tallest structure in North America during the Revolution. Then comes our beloved City Hall, the tallest habitable building from 1894 to 1908. It was also the first secular building to hold the title of tallest; until then the credit went to religious structures, including the Pyramids of Giza. There was a gentleman's agreement not to build anything taller than City Hall, which was ended by One Liberty Place in 1986. Now the Comcast Technology Center stands tall as the 9th tallest building in the United States. Enjoy the skyscrapers of Philly here –

Philly Skyscrapers

Posted in Visualization

Ethics and Sensitivity

My recent speech on ethics seems to have touched some nerves. The feedback was that I touched on “uncomfortable” topics, specifically the swastika, syphilis, and the voter fraud commission. It was suggested that I stick to neutral topics. I thought this post would be a place where I can redirect future questions, comments, and suggestions.

Before I defend myself, I just want to make one thing clear. When I talk about ethics, I'm not a moral arbiter. I'm just like you, a regular Jane (or Joe): a law-abiding, tax-paying, boring neighbor to whom you would nod politely and move along, unless you have some weird quirks like technology, magic realism, historical fiction, painting, home cooking, or native gardening, to name a few; then we can talk all day. That said, I want to help people be aware of complex situations, ask questions to understand them better, and arrive at a decision that aligns with their own or their business's values. When you ask someone in a position of authority or expertise to give a definitive judgement on whether a decision is right or wrong, you could be just passing on your responsibility. This leads to “confirmation bias” (I confirm you and you confirm me) and eventually to ethical blindness in making business or personal decisions.

For example, let us say you are writing an algorithm for self-driving cars. One use case is brake failure. There are two lanes; if the brakes fail, the car could run over two little kids crossing the road in one lane or an old couple following behind them in the other. Which lane would you instruct your algorithm to choose? Now consider: what if one of them has a terminal disease and is about to die in weeks, or what if the old couple are your grandparents, or what if the kids are your own? What is the next step in your algorithm with that extra intelligence? Is there any right or wrong answer here? The situation here involves a vehicular accident. It triggers some strong emotions in me, because I lost a friend in a road accident. Painful memories come back anytime I hear or read about one, and I avoid talking about it. But the discussion here is the ethics of self-driving algorithms for cars. Focus on the situational complexity and try to avoid adding your own personal emotions to the discussion. This is the key thing in talking about ethics. It is common to think “bad people make bad decisions”; that is expected. When “good people make wrong decisions”, the devastation is stronger and deeper. When you make a business decision, it should be for the greater good of a larger number of people.

Understand the Bias –

Biases are strong influences on decisions. One person's decision may seem completely irrational to you, but if you look at it in context it can make perfect sense. At the same time, if your decisions are made on rigid contextual frames and biases, that could potentially lead to bad decisions. My example of the swastika was to say that, for me as an Indian, the swastika carries a positive bias (literal translation from Sanskrit: su, goodness; astika, happens here) as a symbol of the goddess of wealth. This symbol can be traced to the Indus Valley Civilization, circa 3000 B.C.E. The same symbol carries a negative bias for my American husband. It is essential to understand each other's biases to have a successful marriage, or even to live under the same roof. Try reading from “Understand the Bias” again and observe that I did not qualify “negative bias” or launch into that part of history. I felt it was unnecessary to dwell on that. It still evoked strong emotions and made someone uncomfortable. It is better to leave emotions out when discussing complex situations in order to understand our biases.

Focus on the Context –

For “Informed Consent”, the legal precedent is the Common Rule. It started with the infamous Tuskegee Study. I felt it was essential to talk about the history because doctors are good people who have taken a Hippocratic Oath: “First Do No Harm”. The issue was that such good people made a wrong decision to let the black sharecroppers suffer from syphilis for three decades after the cure had been found. I understand syphilis is not as pleasant a topic to discuss as the brush strokes Claude Monet used to immortalize his impressions of the lily pond. My sister (a microbiologist) may disagree with you; she is the only weirdo I've known who talks about dreadful viruses with the same warmth as one would talk about their favorite food. Once I overheard a discussion during dinner with her friends on “pappy”. I thought they were talking about someone's pet dog, only to find it was a discussion on HPV. My sympathies are with you: gross topic. But it is easy to get carried away from the context, which is human rights. Would you let your doctor run tests on you without your consent?

No Politics –

I wanted to bring attention to ethics in data storage. A recent example is that the Voter Fraud Commission of 2017 wanted to store its data in the White House. I clearly said, “Leave all the politics aside and just think: is it a clever idea to store the entire country's voting history in one single place, and that too in the White House?” The situation to focus on is data storage in the White House. One of the requirements to gain access to federal data centers is that the person must be an American born in the United States. Leaders, diplomats, and even tourists from other countries visit the White House every day. I'm sure there are some restricted places in the White House, but there are better places in the country if you want to store such sensitive, massive information. What are the checks and balances here to stop anyone in the White House from abusing such sensitive data? Please observe my language here: I did not venture into politics although the topic is the White House. Staying silent amounts to going along with an injurious behavior or decision. Fear of speaking against power or authority can lead to wrong decisions. In the story of the Emperor's New Clothes, only a little boy had the courage (or was too young and naïve to fear the emperor) to say that the emperor was naked.

Less Sensitive Case Studies –

All of us have our own values; we believe in them and we fight for them. What you may consider “less sensitive” could possibly be the sole existential purpose of another. It could also be seen in a different light. I'm a woman, and taking case studies only about women's issues could be considered “pushing my own values”, while talking about the location of Google's data center on the San Andreas fault line could be taken as a “hidden agenda against Google”. Although I know that if Google goes down the earth would turn upside down, Google is a private company. The United States government has a reach further than Google could ever imagine. All case studies involving humans are equally sensitive, whether they reach a small or a big population.

To conclude, complex situations lead to difficult decision making. When you make that decision for yourself or your business, it is important to see the big picture, understand the biases, and not add personal emotions or fear of authority to the situation. Thanks to technologies like the internet of things and big data shrinking our personal space, the need to analyze every atom of our lives, and the pressure to survive in an aggressive business environment, we are living in an ever-changing world. That is exactly why we need ethics: to coexist peacefully, to not step on each other's toes, and to create a better place to leave for our future generations.

Posted in ethics

Custom Geocoding Tips and Tricks

Tableau has geocodes in its maps database for cities with a population greater than 15,000. If you are working with rural or small-town USA data, your points will not be mapped. I had some fun(?!) with custom geocoding. When you look at community support and find a lot of threads with no successful conclusion, desperate pleas for help, or, worst of all, no replies at all, your heart just sinks to your stomach. Don't fret. Custom geocoding seems to work fine if you follow the instructions to the “T” and take some extra effort not to add your own mess to it. Here are some tips for custom geocoding small-town USA.

    • Do not have null values in “City” or “State” column.
    • Correct all the misspelt cities and states. You'll really save some time. NYC is not a valid city.
    • Expand the state codes to full names, for example “IL” to “Illinois”. That brought me from 356 unknowns down to 157 unknowns.
    • Of course, the inevitable: you have to extend the geocodes for Tableau. Here is where the fun begins.
    • I needed city and state. Maintain the hierarchy and the correct names of the columns: Country (Name), State, State/Province, City, Latitude, Longitude. Any change to this order, or even an extra space around the “/” in State/Province, will get you this error:

“Warnings occurred while importing the custom geocoding data. Do you want to keep the imported geocoding? The file <fileName.csv> could not be used because it was missing the following levels: [Country]. Level names are case sensitive and must match exactly”

    • Do not have duplicate rows. Duplicates will make Tableau ignore the city altogether and throw this warning:

    “Ignored <number> invalid rows in the file <filename>.csv”

    • Say you have 157 unknowns and, after looking at the forums, you try to outsmart Tableau by custom geocoding every inch of the United States. Then even a well-known metropolis like “Boston, MA” will vanish from your map. My guess is that Tableau ignores the duplicates this creates. I would suggest sticking ONLY to your “unknowns” and not the entire United States.
    • Tableau says schema.ini is optional. Creating a schema file helps reduce the time to import the geocodes, as Tableau can understand your file better.
    • Make sure you have a “CSV” file instead of an “Excel” file. With an “Excel” file, importing the geocodes did not throw any errors, but the map still had the same number of “unknowns”.
    • You can try latlong.net to find the latitude and longitude of your small-town USA.
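The clean-up tips above (no nulls, full state names, no duplicates, exact column names, CSV output) can be sketched in a few lines of Python. This is only a rough sketch under assumptions, not the exact process I used: the column list is the Country (Name), State/Province, City, Latitude, Longitude hierarchy, the `STATE_NAMES` lookup and the function names are made up for illustration, and the sample town and coordinates are placeholders.

```python
import csv
import io

# Assumed column names and order, following the hierarchy in the tips above.
COLUMNS = ["Country (Name)", "State/Province", "City", "Latitude", "Longitude"]

# Hypothetical, deliberately partial lookup; extend it to cover your data.
STATE_NAMES = {"IL": "Illinois", "MA": "Massachusetts", "PA": "Pennsylvania"}


def prepare_rows(raw_rows, country="United States"):
    """Drop rows missing a city or state, expand state codes, remove duplicates."""
    seen = set()
    cleaned = []
    for row in raw_rows:
        city = (row.get("City") or "").strip()
        state = (row.get("State") or "").strip()
        if not city or not state:
            continue  # no null values in City or State
        state = STATE_NAMES.get(state.upper(), state)  # e.g. "IL" becomes "Illinois"
        key = (state.lower(), city.lower())
        if key in seen:
            continue  # skip duplicate city/state pairs
        seen.add(key)
        cleaned.append({
            "Country (Name)": country,
            "State/Province": state,
            "City": city,
            "Latitude": row["Latitude"],
            "Longitude": row["Longitude"],
        })
    return cleaned


def to_csv_text(rows):
    """Render the cleaned rows as CSV text with the header in COLUMNS order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder town and coordinates, for illustration only.
sample = [
    {"City": "Exampleville", "State": "IL", "Latitude": "40.0", "Longitude": "-89.0"},
    {"City": "Exampleville", "State": "il", "Latitude": "40.0", "Longitude": "-89.0"},
    {"City": "", "State": "PA", "Latitude": "41.0", "Longitude": "-77.0"},
]
print(to_csv_text(prepare_rows(sample)))
```

Feed it only your “unknowns”, save the text as a .csv file, and import that file into Tableau.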

    If you want a quick sample, my files are here. Have fun geocoding!

Posted in Visualization

US Airports

The United States has 172 large airports clustered around the coasts and metropolitan cities, 686 medium airports spread among major cities, and 13,890 small airports scattered across the country. If you look at the point distribution of airports across the United States, Nevada, Montana, Wyoming, and Utah have a relatively lower density compared to the rest of the country. Is it because of the geography or the trade and economic status of these states? Here is the visualization with embedded links to the Wikipedia pages of the airports –

United States Airports

Posted in Visualization

Ethics in Big Data Analytics

Here are the slides from my session yesterday on Ethics in Big Data Analytics for DAMA Philadelphia.

Posted in Big Data, Conceptual, ethics, Philosophy

Cha-ching

As an avid numismatist, I love the United States Mint in Philadelphia. My husband knows where to buy gifts for me 😊 To celebrate the 225th anniversary of the United States Mint the data way, here is the story in numbers. I find that production fluctuates with the health of the economy. You can notice a pronounced dip during the Great Depression and the recent recession. Also, there was a boost in production with the stimulus packages.

United States Mint Celebrating 225 Years

Posted in Visualization

Thou Shall Not Covet

Both my husband and my sister are huge fans of crime stories. On those rare family vacations, they like to binge watch “ID”. Last week while I was ironing his shirts, there was a crime story on TV in which a man cheats on his wife with lady number one and eventually moves on to lady number two. Lady number one snaps and kills the man. I overheard my sister's commentary on how adultery is a “sin” and not the “right thing”, and that lady number one would “pay for it”. It switched on my imaginary light bulb. When I analyze, sometimes to an annoying extent, I can argue from both sides. My argument is that lady number one will never “pay for” the adulterous relationship. The law will only punish her for killing a man and not for the adulterous relationship, which is not defined as a “crime” by any legal standard. It is not illegal, but is it ethical? If you start to derive ethical standards from religion, the Ten Commandments say: thou shall not covet. The full text goes like this –

You shall not covet your neighbor’s house. You shall not covet your neighbor’s wife, or his male or female servant, his ox or donkey, or anything that belongs to your neighbor. — Exodus 20:17

The Ten Commandments are considered the literal word of God. Texts from other religions have similar words on adultery. Technically, the literal reading is that a man should not covet someone's “wife”. It doesn't give a similar instruction against coveting someone's “husband”. So, can we say this rule applies only to men and not to women? I joked about it to my sister as “#Feminism #WomenEmpowerment”. To quote Merriam-Webster, ethics is “rules of behavior based on ideas about what is morally good and bad”. To put it in context, those were the good old days (#Sarcasm) when women were treated as the property of men and not as individuals. The words make sense in that setting. We have evolved to give equal rights to women, and thus a woman is morally accountable if she pursues a married man. Ethical standards evolve with the accepted norms of society. As the world shrinks with technology and human migration, we have moved into new multi-cultural, multi-ethnic realms that directed us to new codifications of human rights, civil rights, and animal rights. With technology comes the big data elephant that rampages through our daily lives with both positive and negative disruptions. It is time to address the elephant in our living room. As the legal standards of the data world begin to evolve, we should start the debate on the ethical qualms of data rights.

Elephant in The Room

Posted in Big Data, Conceptual, ethics

Global Significant Earthquakes

The Global Significant Earthquake Database from the National Oceanic and Atmospheric Administration compiles the significant earthquakes that have rocked the world from 2150 BCE to the present. The Pacific Ring of Fire shows more pronounced tsunamis triggered by earthquakes. The story is here.

Earthquakes

Posted in Visualization

World Satellites Exploration

The UCS Satellite Database lists the satellites orbiting Earth. As of the 1st of June 2017, there are 1459 satellites in orbit. Country of origin indicates the country that is registered as responsible for the satellite in the UN Register of Space Objects. “NR” indicates that the satellite has never been registered with the United Nations. Operator country is the one that operates or owns the satellite. Contractor country represents the country or business entities responsible for the satellite's construction. The United States leads the race in every single category. India is picking up pace, with 97 satellites launched from Satish Dhawan Space Center and 44 satellites constructed. Atlas V, which has launched the most satellites, was formerly operated by Lockheed Martin and is now operated by the Lockheed Martin-Boeing joint venture. Most of the satellites were launched in the past five years and are used for commercial communications. Explore more details here.

World Satellites

Posted in Visualization

Primum Non Nocere

I want to discuss one of the lesser known studies in the world. The study is named the “Mushroom Trial”; it was spearheaded by my loving mother, and the subjects were my immediate family. She loves to try new recipes, and in the late 90's she found a new “vegetable” to cook with, commonly called mushrooms. We were all thrilled to try new stuff, although I voiced my concern: “but that is fungus”. All of us enjoyed the meal, though I didn't have my usual portion because of my fungus apprehensions. 20 minutes later, I had intense stomach cramps and started throwing up. My mom was annoyed by the unwarranted trip to the ER for non-stop projectile vomiting over something as innocent and pure as farm-fresh, organic ingredients and love. She assumed that I had been jumping around and climbing trees in the backyard after my meal. The next time she cooked mushrooms, she asked me to have a generous helping. This time the results were more ominous: I started vomiting bile and broke out in pronounced rashes. This time I hadn't been jumping around, and in the ER she assumed the kind of mushrooms she cooked could be the reason. Not wanting to upset my mother, I grew numb to my apprehension while becoming increasingly wary of the physical distress. Eventually the doctor in the ER ended the mushroom trial with “can we just say she is allergic to all kinds of mushrooms and save everyone time?” Overall, her trial went like this –

Mushroom Study

Now, let us analyze the data from this study:

Informed Consent

The most essential part of data collection or a study is informed consent. The Institutional Review Board (IRB) requires informed consent in research that involves human subjects unless you get a waiver or alteration due to the sensitive nature of the study. There are two parts here: “inform” and “consent”. You have to inform the human subjects and get their consent. In this trial, mom did inform everyone about the mushrooms but did not necessarily get their consent. That brings up another interesting question: although it was one study, there were 5 different individual experiments. So, do we need informed consent only at the beginning, or do we need to renew it as we change the parameters of the study? Informed consent is more than a form or a piece of paper; it is a process through which we show respect for individuals. Participation in any research is voluntary.

Data Inputs

There are several issues with the data inputs. It was a poor selection of 5 people from a big family tree, and it is biased toward the immediate family. More than that, the study also promotes a historical bias: “mom knows better”. My mom never intentionally tried to harm me. Towards the end, I did raise the objection that it makes me sick. Unlike in the United States, food allergies are not that prevalent in India, to the point that even the doctor took time to come to the conclusion. The dataset is also incomplete, in that I am not sure whether I had the same “but that is fungus” apprehension after the first time. I just went along with my mom, trying not to upset her. In a way, it is incorrect to conclude “allergic to all mushrooms”. The data is also outdated: this dataset is from my childhood, and some allergies can fade as a kid grows.

Algorithmic Bias

The problem with this study is assuming that correlation implies causation. It is commonly held that one should not engage in strenuous physical activity after meals, and that not every kind within the same food group causes an adverse reaction. Both of these assumed explanations were completely wrong about the true cause. When personalization or recommendation algorithms are built on incorrect assumptions and bad data inputs, they can skew the possibilities of expansion and create tunnel vision.

Data Privacy

Aside from talking about connections in a philosophical way, we are connected more than ever through the Internet of Things. In this connected world, where information is literally at your fingertips, what has become of privacy? I chose to expose this study and dataset on my website, and that exposed my immediate family. My family is very private. I have the most social media presence. I made the dataset public. Have I violated their privacy? Not exactly, by the letter of the law. Are there ethical violations? There is very little chance that my parents will read this and get upset about it. There is a possibility that my siblings will read it (I usually ask them to read). If they read it and raise concerns, what are the possible actions? Should I say “my blog, my stories”, “offer them some form of compensation”, or “take down this post”?

Technological advancements such as big data open doors to endless possibilities everywhere. As always, with great power comes great responsibility. Data laws are still in the primary stages of evolution. Ethics can get polarizing and controversial, as with any issue in this country. Study after study that I read shows how we are charting unexplored waters while the real world gets increasingly complex. That reminds me of the principal precept in medicine and bioethics: “First, do no harm!”. The data world could start from there too. I want to share my musings on the one area where we should keep our focus, ethics, as our digital universe explodes.

Posted in Big Data, Conceptual, ethics