Sunday, September 20, 2015

Trans-Siberian Railway 2015 #1

9000 km, 25 days, All by train
Xi'An -> Beijing -> Ulaanbaatar -> Irkutsk -> Moscow -> St Petersburg

This is a multi-part series documenting my journey on the Trans-Siberian Railway (TSR) in 2015.
This wasn't an easy trip to plan, so the main purpose here is to pass on what I learned to anyone else who wishes to make this trip.
This first post highlights what you need to know when you start planning your trip.

Content of this post:
1. What is Trans-Siberian Railway(TSR)?
2. Brief summary of my journey.
3. First thing to consider: When should you go?
4. Second thing to consider: How long do you have to travel?
5. Third thing to consider: What is your budget?
6. Should I go on this trip?
Taken from the train.

What is Trans-Siberian Railway?

The Trans-Siberian Railway is a network of railway lines that connects Moscow (western Russia) to eastern Russia. Several branches split off the main line and run through Mongolia and China.

Image from telegraph.co.uk
Although there are 3 routes to take, collectively they are still called the Trans-Siberian Railway. I took the Trans-Mongolian route, starting from Beijing and passing through Mongolia and then Russia. Impressively, it takes about 7-8 full days to travel from Beijing to Moscow directly. The whole journey spans seven time zones and covers about 9000 km. I don't think many people would attempt such a long direct journey, since there are many places to visit on the way.

Brief summary of my journey

Taken in Mongolia. A lone baby camel walking back to the nomad's camp.
  • 8th May 2015: One day after my final exam in NUS, I left on a flight from Singapore to Xi'An China.
  • 9th - 10th May 2015: Explored Xi'An and took a train to Beijing.
  • 11th - 13th May 2015: Explored Beijing and embarked on the Trans-Siberian Railway. (Start of TSR)
  • 14th May 2015: Arrived at Ulaanbaatar, Mongolia after a 27-hour train ride.
  • 15th - 18th May 2015: Explored a region of Mongolia. Left for Irkutsk at night.
  • 19th May 2015: On the train to Irkutsk.
  • 20th - 24th May 2015: Explored Irkutsk and stayed on Olkhon Island for a few days.
  • 25th - 27th May 2015: On the train to Moscow.
  • 28th - 30th May 2015: Explored Moscow and left for Saint Petersburg(SP). (End of TSR)
  • 31st May - 1st Jun 2015: Explored SP.
  • 2nd Jun 2015: Home.
Reasons for some travel decisions:
  • Xi'An and SP are not part of the Trans-Siberian Railway, but they were destinations that we really wanted to visit.
  • I chose to start from Beijing because I wanted my trip to end near Europe. At that time I was considering visiting Turkey before flying back to Singapore (in the end I did not).
  • Month of May: We had just finished our final exams in NUS and my work would start in late June.
Taken on the train. Cute Austrian neighbor.

First thing to consider: When should you go?

  • February and March, for the frozen Lake Baikal (more on that later). However, you would have to take the harsh weather into account.
  • May. I went during the month of May and the weather was still a bit cold. On most days, I just had to put on a jacket or two to keep myself warm. As this was before the peak tourist season, we were able to have the whole 4-berth compartment to ourselves for a few legs of our train journey.
  • July seems to be the peak tourist season in Mongolia due to the festivities and good weather.
  • You may explore other months too.

Second thing to consider: How long do you have to travel?

  • Beijing: I think you will need about 3-4 days to visit the main attractions in Beijing, e.g. the Great Wall, Tian An Men and the Forbidden City.
  • Ulaanbaatar, Mongolia: Ulaanbaatar is the capital of Mongolia. I recommend spending at least 1 week in Mongolia. If you have the luxury of time, you should spend at least 3 weeks because every part of Mongolia is very different.
  • Irkutsk, Russia: 4-5 days because usually visitors will stay a few days on Olkhon Island.
  • There are several other cities between Irkutsk and Moscow. However, I don't think they will be your priority if you don't have a lot of time.
  • Moscow, Russia: 3-4 days is enough for you to visit most of the attractions around the Kremlin. The museums are quite interesting, although some of them are in Russian only. There are also free tours organised daily(?). It is a nice gesture to tip the guides since they are usually quite good.
  • Train rides: No matter where you decide to stop, the train rides from Beijing to Moscow will still add up to about 7-8 days. Do take this into account.
In total, I spent 19 days on the TSR. I will break them down in my subsequent posts. I felt that 19 days were sufficient for me to visit most of the attractions (though maybe I could have spent more time in Mongolia).

Third thing to consider: What is your budget?

Taken on the train. This Mongolian man stayed in our berth and he gave us horse meat for lunch. 
  • Air and train tickets
    • Air Ticket: Depends on where you are flying from and flying to after the trip. I spent about SGD$1000 in total. (From Singapore to Xi'An and from SP to Singapore)
    • Train Tickets: We booked 2nd class tickets for the entire TSR. We also spent about SGD$1000. Yes, train tickets are not cheap, and you can save much more if you buy 3rd class tickets.
  • Food and Accommodation: 
    • This is a very personal decision based on your lifestyle and level of comfort. 
    • We stayed in airbnbs and guesthouses.
    • We ate bread and instant noodles on the trains and went to very few restaurants.
    • In total we spent less than SGD$1000.
  • Museum tickets and tours:
    • We visited a number of museums in China, Mongolia and Russia. Do bring your student pass if you wish to enjoy the discounted tickets. 
    • After doing some research, we concluded that renting a car and exploring Mongolia on our own is not recommended. Hence, we paired up with another two travelers on a tour in Mongolia and got cheaper tour prices: approximately USD$62 per person per day, which includes guide, accommodation, food and transport. We used the services of Khongor and it was a positive experience.
    • We were also on a short paid tour on Olkhon Island.
    • Tours in Moscow and SP were free. However, we paid for another walking tour in Moscow because we found the tour guide to be very good and approachable.
I spent approximately $3k SGD on this trip. More money could have been saved if we had chosen 3rd class train tickets. Otherwise we were rather frugal in our spending.

Should I go on this trip?

  • Yes, only if you have 3 or more weeks of holiday. Don't rush, take your time and immerse yourself in the trip.
  • Yes, only if you have enough money. At least $3k SGD is required. Less if you buy 3rd class train tickets and avoid proper food altogether.
  • Yes, only if you can sit on a train for at least 3 consecutive days. Our longest leg (Irkutsk to Moscow) took us 3 days to complete. During that train ride, we read books, listened to music, ate bread/cup noodles, drank vodka and not much else.

I will be continuing the rest of this TSR journey in the subsequent posts.

Monday, April 20, 2015

Dealing with NAs in your dataset (R)

Content Summary

  1. Testing for NA and NaN using is.na and is.nan
  2. Replacing the NA or NaN with mean or median.
  3. Removing rows with NA or NaN in Age from the dataset
  4. Removing rows with NA or NaN in any column from the dataset
  5. na.omit

Introduction

In some situations, you may find yourself with a dataset littered with NAs or NaNs that prevent you from carrying out the next step of your analysis. NA stands for “Not Available” while NaN stands for “Not a Number”. They could be due to missing data in your survey results or to some illegal operations (such as 0/0).
In this post, I will be covering some of the common ways to deal with them in a dataset. I will be using a portion of the famous Titanic dataset to illustrate my examples.

Quick look at the dataset

First we read in our data:
setwd("~/Downloads")        # setting working directory
titanic <- read.csv("test.csv", header=TRUE)

A quick glance at the first 6 rows of the dataset:
head(titanic, 6)
##   PassengerId Pclass                                         Name    Sex
## 1         892      3                             Kelly, Mr. James   male
## 2         893      3             Wilkes, Mrs. James (Ellen Needs) female
## 3         894      2                    Myles, Mr. Thomas Francis   male
## 4         895      3                             Wirz, Mr. Albert   male
## 5         896      3 Hirvonen, Mrs. Alexander (Helga E Lindqvist) female
## 6         897      3                   Svensson, Mr. Johan Cervin   male
##    Age SibSp Parch  Ticket    Fare Cabin Embarked
## 1 34.5     0     0  330911  7.8292              Q
## 2 47.0     1     0  363272  7.0000              S
## 3 62.0     0     0  240276  9.6875              Q
## 4 27.0     0     0  315154  8.6625              S
## 5 22.0     1     1 3101298 12.2875              S
## 6 14.0     0     0    7538  9.2250              S

Everything seems alright. However, if we try to calculate the mean of the Age variable, we run into a problem:
mean(titanic$Age)
## [1] NA

Looking at the first 100 entries of the Age variable, you can see that there are a lot of NAs inside that column. This results in the NA output when we try to calculate its mean.
titanic$Age[1:100]
##   [1] 34.5 47.0 62.0 27.0 22.0 14.0 30.0 26.0 18.0 21.0   NA 46.0 23.0 63.0
##  [15] 47.0 24.0 35.0 21.0 27.0 45.0 55.0  9.0   NA 21.0 48.0 50.0 22.0 22.5
##  [29] 41.0   NA 50.0 24.0 33.0   NA 30.0 18.5   NA 21.0 25.0   NA 39.0   NA
##  [43] 41.0 30.0 45.0 25.0 45.0   NA 60.0 36.0 24.0 27.0 20.0 28.0   NA 10.0
##  [57] 35.0 25.0   NA 36.0 17.0 32.0 18.0 22.0 13.0   NA 18.0 47.0 31.0 60.0
##  [71] 24.0 21.0 29.0 28.5 35.0 32.5   NA 55.0 30.0 24.0  6.0 67.0 49.0   NA
##  [85]   NA   NA 27.0 18.0   NA  2.0 22.0   NA 27.0   NA 25.0 25.0 76.0 29.0
##  [99] 20.0 33.0

Dealing with the NAs

  1. Testing for NA and NaN using is.na and is.nan
These are basic functions to test whether a value is NA or NaN. is.na returns TRUE for both NAs and NaNs. However, is.nan returns TRUE only for NaNs.
vect <- c(1,2,3, NA, NaN)
is.na(vect)
## [1] FALSE FALSE FALSE  TRUE  TRUE
is.nan(vect)
## [1] FALSE FALSE FALSE FALSE  TRUE
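
The same test can be applied to a whole data frame to count the missing values in each column. This is a minimal sketch on a small made-up data frame called toy (an assumption for illustration, not the Titanic data):

```r
# toy is a made-up data frame for illustration, not part of the Titanic data
toy <- data.frame(
  Age  = c(34.5, NA, 62, NaN),
  Fare = c(7.83, 7.00, NA, 8.66)
)

# is.na() on a data frame returns a logical matrix;
# colSums() then counts the TRUE values in each column
na.count <- colSums(is.na(toy))
na.count   # Age has 2 missing values (the NA and the NaN), Fare has 1
```

Running this on your own dataset before any cleaning tells you which columns need attention.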

  2. Replacing the NA or NaN with mean or median.
A common way to deal with these values is to replace them with the mean or median of the other available data.
# first we calculate the mean of the available data in Age
age.mean <- mean(titanic$Age, na.rm=TRUE)
titanic$Age[is.na(titanic$Age)] = age.mean
age.mean
## [1] 30.27259
mean(titanic$Age)
## [1] 30.27259
We have successfully replaced all NAs by 30.27259. As a result, the output of mean(titanic$Age) is no longer NA.
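
The example above uses the mean. Replacing with the median follows the same pattern; here is a minimal sketch on a made-up vector called ages (an assumption for illustration, not the Titanic Age column):

```r
# ages is a made-up vector for illustration
ages <- c(22, 35, NA, 60, 27, NA)

# median of the available values only (22, 27, 35, 60), which is 31
age.median <- median(ages, na.rm = TRUE)

# overwrite the missing entries with that median
ages[is.na(ages)] <- age.median
ages
```

The median is less sensitive to a few extreme values than the mean, which may matter for a skewed variable.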

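  3. Removing rows with NA or NaN in Age from the dataset
While removing data from your dataset is generally not a good approach, it is still useful in some cases. Since the previous step filled in the missing ages, we read the data in again first.
titanic <- read.csv("test.csv", header=TRUE)   # reload so that Age still contains NAs
NROW(titanic)               # number of rows in the titanic dataset
## [1] 418
titanic <- titanic[!is.na(titanic$Age),]
NROW(titanic)
## [1] 332
This effectively removes all the rows with NA present in the Age column. Do note the use of the exclamation mark before is.na, which negates the test so that only rows with a known Age are kept. In total 418 - 332 = 86 rows of data were removed.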

  4. Removing rows with NA or NaN in any column from the dataset
Here we use another function, complete.cases, which returns TRUE if there are no NAs at all in the entire row.
titanic <- titanic[complete.cases(titanic),]
NROW(titanic)
## [1] 331
Notice that 331 rows remain instead of 332. There is one more NA present in another column (other than Age).
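
To see which rows complete.cases would drop before actually dropping them, you can inspect its logical output first. This is a minimal sketch on a made-up data frame called toy (an assumption for illustration):

```r
# toy is a made-up data frame for illustration
toy <- data.frame(
  Age  = c(34.5, NA, 62, 27),
  Fare = c(7.83, 7.00, NA, 8.66)
)

# one logical value per row: TRUE when the row has no NA in any column
keep <- complete.cases(toy)

# row numbers that contain at least one NA (rows 2 and 3 here)
incomplete.rows <- which(!keep)

# keep only the complete rows
toy.complete <- toy[keep, ]
NROW(toy.complete)
```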

  5. na.omit
This function returns a data frame with the rows containing NA in any column removed. This is basically the same as 4.
titanic <- na.omit(titanic)
NROW(titanic)
## [1] 331
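
One small difference from complete.cases is that na.omit records which rows it dropped in an attribute named na.action. This is a minimal sketch on a made-up data frame called toy (an assumption for illustration):

```r
# toy is a made-up data frame for illustration
toy <- data.frame(
  Age  = c(34.5, NA, 62, 27),
  Fare = c(7.83, 7.00, NA, 8.66)
)

by.omit  <- na.omit(toy)
by.cases <- toy[complete.cases(toy), ]

# the same rows survive either way
same.rows <- identical(rownames(by.omit), rownames(by.cases))

# na.omit additionally remembers the row numbers it removed
dropped <- as.integer(attr(by.omit, "na.action"))
dropped
```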

Sunday, December 28, 2014

Melaka 2014

(TLDR? Click here for more pictures or scroll down to see useful printable maps for most recommended places of interest. I also included a rough itinerary for a 2 day 1 night visit to Melaka during the weekend.)
Hello! (Okay that's not me)
What is Melaka (Malacca)?
Melaka is the capital of the state of Malacca which is on the west coast of peninsular Malaysia. It is one of the UNESCO world heritage sites in Malaysia because of its rich historical and cultural background from previous Portuguese, Dutch and British rule. 



How do I get from Singapore to Melaka (and back)?
The best way to get to Melaka is pretty much by coach.
  1. (Straightforward) You can choose to take a coach from Singapore to Melaka. In this case, you can choose from a large selection of coach companies. The coaches differ in price, pick-up and drop-off locations and timing. This time I chose to go by 707 Inc and return by Transtar Travel Pte Ltd. In total, I paid SGD 56. It took us about 6 hours to reach Melaka with 1 rest stop in between. (To add, I encourage taking Transtar Travel because I had a much better travelling experience compared to 707's. I won't elaborate much on the details for now. If interested, let me know in the comments section.)
  2. (Cheaper and recommended) You can also choose to take a coach from JB to Melaka. JB is just across the Woodlands Causeway which is super convenient for Singaporeans living in the North. You need to take a bus from JB Larkin Terminal. The Larkin Terminal is very close to the Malaysia checkpoint / City Square. A round trip from JB to Melaka will cost you about MYR 40 which is less than SGD 20. Definitely a lot cheaper than the first option. 
You can obtain all coach details and purchase your tickets on easybook.com.

Accommodation
Seen on Uncle Ringo's Foyer Entrance
This really depends on your budget. Travelling on a budget, I stayed in a guesthouse, in particular Uncle Ringo's Foyer.
So what to expect? I booked a double room for 3 nights which cost me about MYR 120 in total (MYR 60 per pax). At this price range, the guesthouse provides the bare necessities. For the double room, you get 2 beds, 1 power socket, air con, blanket and pillow. Toilet facilities are communal. As it is a guesthouse, you can choose to interact with the other backpackers or just go your own way. The owner, Raymond(?), is also super friendly and will answer any questions you may have about Melaka. Personally, this place is clean and good enough for me. It is just a place for me to sleep at night as I would be out all day. I don't really care about the rest. It is also a 5-minute walk to Jonker Street. Overall, my stay was pleasant and I do recommend this guesthouse.

You can book on Hostelworld.com which is reliable and good enough. 

So what should I do in Melaka?
Personally, I believe that one should always experience the city like a local. So I managed to contact a couch surfer who is a Malaccan working in Singapore. On the first day, she drove us around Melaka to visit different places of interest and introduced us to some of the famous local food.

In this section, I shall highlight the main sections of my trip. There will be a map below.
  1. Heritage and Architecture:
    • Being a UNESCO world heritage site, you can expect to see a lot of buildings left behind by the Dutch and Portuguese. Most of these buildings are located at St Paul's Church and Portuguese Square. After a long walk, remember to visit A'Famosa before retreating to the comfort of the air-conditioned shopping malls.
      At the Portuguese Square
    • Melaka is mostly of Peranakan culture. You can take a journey back to Old Malacca by walking along Heeren Street (also known as Tun Tan Cheng Lock Street). Be sure to visit the various museums that can be found nearby. Definitely visit No 8 Heeren Street Heritage Centre to look at one of the traditional shophouses in Melaka. UNESCO has restored and maintained 9(?) of these shophouses. The rest are either poorly maintained, demolished or modified.
      Inside of a traditional shophouse
    • Take a long afternoon walk along the streets of Melaka. Besides the famous Jonker Street and the recommended Heeren Street, immerse yourself in Melaka by walking along the other streets near Jonker Street. There are treats around every corner.
  2. Night Market
    • Every Saturday and Sunday night, Jonker Street is transformed into a bustling night market filled with stalls selling clothing, handicrafts and food. The night market runs from about 6-7pm to around 11pm-12 midnight. There are so many things to see and food to eat. If you are tired, you can retreat into one of the "cafes" nearby, in particular Geographer's Cafe, which is actually more of a bar.
      Chilling at Geographer's Cafe

      Fried Carrot Cake (Actually made of radish)
  3. Shopping
    • Other than the night market, there are plenty of shops around Jonker Street that sell clothing and handicrafts. If it is raining or the weather is too hot, there are a few shopping malls located about a 20-minute walk from Jonker Street. The main ones are Pahlawan, Mahkota Parade and Heeren Square.
  4. Food
    • Melaka is famous for their Baba and Nyonya food: Some of my recommended shops are Donald and Lily's and Jonker 88. They serve amazing authentic nyonya food like Laksa, Nasi Lemak and Chendol. Omg omg. Definitely get these. 
      Assam Laksa. (Sedap Sekali!)
    • Chicken Rice Ball: This is not my personal favourite Melaka food because Nyonya still wins. You can identify the famous Chung Wah Chicken Rice Ball very easily along Jonker Street. I have been advised by the locals that the queue is not worth it and you are better off getting your chicken rice ball anywhere else. 
      Chicken rice ball (not Chung Wah)
    • Melaka Satay Lup Lup: Basically it is like the Chinese steamboat, but instead of soup you get peanut gravy as your base. The locals recommended Capitol Satay Celup.
      Lup Lup!
    • Our guesthouse owner highly recommended Pak Putra which is a tandoori restaurant. From the way he described it, it seems like a must-go destination.
    • Cafes: There are a few cafes located near Jonker Street. Personally I highly recommend Calanthe Cafe; I went there at least 3 times during my 4 days in Melaka. They serve really good coffee originating from all 13 states of Malaysia. This cafe is open till 10/11pm.
      Ice blended with Malacca coffee. (Choice of coffee from up to 13 states)

      Front
    • Local coffee shops: Visit the Lung Ann Refreshment which serves traditional toast with egg and coffee/tea.
      Lung Ann
    • Oh oh. Before I forget, definitely try the "Huang Di Bao" (Emperor bun) at the dim sum restaurant near Pak Putra. I think it is called Hoi Hiong Restoran. It is insanely awesome. I would never have found out about this without our couch surfer friend. Basically it is a bun stuffed with pork (and chicken?) and some other fillings. It is then covered with glutinous rice. Result? An explosion of awesome taste when you cut it open. :D
      Emperor Bun
Resources
  1. For more pictures of Melaka, visit my flickr.
  2. Useful websites: http://melakatravel.info and http://wikitravel.org/en/Malacca
  3. Maps for your use. On these 4 maps, there are enough activities to do and things to see for about 2-3 days. Stars indicate the places that I recommend.
Overview
A
B
C
Rough Itinerary
Day 1
  • Assuming you take the earliest coach on Saturday morning, you will reach Melaka in the afternoon. After checking in and resting in your guesthouse/hotel, walk around Area A (which is around Jonker Street) and visit the night market at night. I recommend visiting food places such as Pak Putra, the food stalls at the night market and Satay Lup Lup.
  • Note: My guesthouse was located near Jonker Street.
Day 2
  • Wake up early in the morning and walk to Donald and Lily's (Area B) for a great breakfast. Slowly head back to Area C and tour around the historic sites. You may also visit the museums in Area A along Heeren Street.
  • Optional to visit the shopping malls. I know the goods are slightly cheaper here but I don't think one leaves Singapore just to visit more shopping malls.
Thanks for reading and have fun touring Melaka :)

Saturday, December 13, 2014

ST3236 / MA3238 Stochastic Processes 1


Module Description
This module introduces the concept of modelling dependence and focuses on discrete-time Markov chains. Major topics: discrete-time Markov chains, examples of discrete-time Markov chains, classification of states, irreducibility, periodicity, first passage times, recurrence and transience, convergence theorems and stationary distributions.

Lecture Topics
- Probability
- Random Variables
- Expectation
- Variance
- Conditional Expectations
- Applications of conditional expectation
- Introduction to Markov Chains
- Chapman-Kolmogorov Equations
- Classification of States
- Long time behaviours
- Time Reversibility
- Branching Processes

This will be your first exposure to Markov Chains. It is a very interesting topic and fun to play with.
If you are a statistics major, make sure you pick this module in semester 1. In semester 1 it is taught by the statistics department and is not that rigorous on the proofs. In semester 2 it is taught by the mathematics department: more theoretical, more proofs.

Workload
Two two-hour lectures
Lectures were taught by Dr David Chew. He is generally a good lecturer, very helpful and patient.

One two-hour tutorial
Tutorials were held in the classrooms. The tutorials are generally not difficult. Like I said, Markov chains are very interesting and it's a joy to solve the questions.

Assessment 
20% 2 online assignments
20% Term Test
60% Final Exam

The two online assignments are conducted on IVLE. Check your answers with friends.

The final paper is a 2-hour closed book paper. You will be allowed to bring two cheat sheets.

Personal
I found the topics very interesting and maybe that's why I was able to do better. I also found the questions in the final paper do-able. As long as you are careful in your calculations, it should be alright. Double check and make sure everything tallies. Have fun with the transition matrices. :)

Extra notes to readers: I have lecture notes, tutorials and past year papers. Download them from my dropbox.

ST2131 / MA2216 Probability


Module Description
The objective of this module is to give an elementary introduction to probability theory for science (including computing science, social sciences and management sciences) and engineering students with knowledge of elementary calculus. It will cover not only the mathematics of probability theory but will work through many diversified examples to illustrate the wide scope of applicability of probability. Topics covered are: counting methods, sample space and events, axioms of probability, conditional probability, independence, random variables, discrete and continuous distributions, joint and marginal distributions, conditional distribution, independence of random variables, expectation, conditional expectation, moment generating function, central limit theorem, the weak law of large numbers. This module is targeted at students who are interested in Statistics and are able to meet the prerequisite. It is an essential module for Industrial and Systems Engineering students.

Lecture Topics
- Sample Spaces, Probability and Combinations
- Axioms of Probability
- Binomial, Bernoulli, Poisson Random Variables
- Discrete Uniform, Geometric, Hypergeometric, Negative Binomial, Exponential, Gamma, Continuous Uniform and Normal Random Variables
- Distribution of a Function of Random Variables
- Joint Distribution Functions
- Sum of Independent Random Variables
- Multidimensional Change of Random Variables and Bivariate Normal
- Properties of Expectation
- Properties of Expectation, Conditional Expectation
- Moment Generating Functions, Inequalities
- Central Limit Theorem (CLT)

This module is the first core module for a year 2 statistics major. Workload is medium. Concepts are quite alright. Nothing too difficult.

Workload
Two two-hour lectures
My lecturer was a visiting Prof from Israel. He used handwritten notes which were very messy. The whole module felt so disorganised. He uploaded the lecture notes 5-10 minutes before the class, which was at 8 am. The print shop was not even open for us to print the notes.

One two-hour tutorial
Tutorials were held in the computer labs. Most of them were quite do-able, at least to me.

Textbook is compulsory. The Prof expects you to read it before the class.

Lectures were webcast.

Assessment
5% Online Quizzes
15% Written Assignments
20% 1.5 hrs Midterm (MCQ)
60% 2 hrs Finals

The online quizzes consist of probability questions that you have to complete on IVLE. You can only attempt each one once. There will be practice questions to let you know the structure of the questions.

The midterm was very easy. It consisted of 20 MCQs. Most of them tested you on permutations and combinations (P&C).

Personal
Most people did very well for mid term.

The final paper was very difficult. It is easily the hardest paper I have taken in my 4 years' worth of exams. There were 6 questions and I couldn't answer any of them fully. It's quite insane. For the first time I felt so helpless and thought I would fail terribly. I passed with average results, so I am thankful for that.

This module is also taken by the Maths and ISE majors. Good luck with the bell curve.

Not much practice needed. No need to mad mug the textbook. It's either you get it or you don't.

Extra notes to readers: I have lecture notes, tutorials, past year papers, textbook inside my dropbox. Cheers!