R

Comparing 'user' Internet connection from some Catalan research centers

Using the same technique seen in the old post “Comparing ping time between connections”, I asked some colleagues to run the following command at their research centers. ping www.google.com -c 200 > ping_google.txt So, I loaded the multiple ping files to create a data.frame with the icmp_seq number, the time spent per ping and the institution where the ping was run. ping <- lapply( files, function( file ) { dta <- read.
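The excerpt is cut off before the loading code is complete, so what follows is only a minimal sketch of the idea, not the post's actual code. It assumes one ping file per institution; the helper `parse_ping`, the institution names and the sample ping lines are all made up for illustration.

```r
# Hypothetical helper: keep the reply lines of a ping log and extract
# the sequence number and the round-trip time with regular expressions.
parse_ping <- function(path) {
  lines <- readLines(path)
  lines <- grep("icmp_seq=", lines, value = TRUE, fixed = TRUE)
  data.frame(
    icmp_seq = as.integer(sub(".*icmp_seq=([0-9]+).*", "\\1", lines)),
    time = as.numeric(sub(".*time=([0-9.]+).*", "\\1", lines)),
    stringsAsFactors = FALSE
  )
}

# Made-up ping output written to temporary files, one per institution.
sample_log <- function(times) {
  c(
    "PING www.google.com (203.0.113.10) 56(84) bytes of data.",
    sprintf(
      "64 bytes from example-host (203.0.113.10): icmp_seq=%d ttl=53 time=%.1f ms",
      seq_along(times), times
    )
  )
}
files <- c(center_A = tempfile(fileext = ".txt"), center_B = tempfile(fileext = ".txt"))
writeLines(sample_log(c(12.3, 11.8)), files[["center_A"]])
writeLines(sample_log(c(25.4, 27.0)), files[["center_B"]])

# One data.frame per file, tagged with its institution, then stacked.
ping <- lapply(names(files), function(inst) {
  dta <- parse_ping(files[[inst]])
  dta$institution <- inst
  dta
})
ping <- do.call(rbind, ping)
```

The result is a long-format data.frame with one row per ping, which is convenient for comparing institutions with a grouped summary or a boxplot.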

From Barcelona to Boston in R

Today I am traveling to Boston to attend BioC 2017: Where Software and Biology Connect. On this trip to Boston, I stop over in Lisbon to take the transoceanic flight. Let’s see a map “centered” on the Boston-Barcelona route using the package maps: library( maps ) xlim <- c( -140, 20 ) ylim <- c( 25, 50 ) map( "world", lwd = 0.75, xlim = xlim, ylim = ylim ) Using ggmap we can get the longitude and the latitude of the cities:

Designing rexposome's (hex-)sticker

A workmate from ISGlobal introduced me to the hexSticker package by Guangchuang Yu. To install hexSticker on my Ubuntu I needed to install some system libraries: sudo apt-get install libtiff5-dev libfftw3-3 libfftw3-dev Then, the only unsatisfied dependency on my laptop was: BiocInstaller::biocLite( "ggimage" ) So, I installed hexSticker using devtools: devtools::install_github( "GuangchuangYu/hexSticker" ) The goal of the post was to create the hex-sticker for rexposome, my package/framework/project for exposome data analysis.

Comparing ping time between connections

To perform this test I pinged Google 200 times from my PC at ISGlobal (running Linux Mint in a Virtual Machine) and from my laptop (running native Ubuntu). I saved the output in two TXT files with a command like the following one: ping www.google.com -c 200 > ping_google_work_wifi.txt I processed both files in R to create a data.frame. pwm <- read.delim( file1, nrows = 200, skip = 1, header = FALSE, sep = " ", stringsAsFactors = FALSE ) pwm <- pwm[ , c( 6, 8 ) ] colnames( pwm ) <- c( "icmp_seq", "time" ) pwm$icmp_seq <- as.
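The excerpt stops in the middle of the conversion step. As a hedged sketch of how the columns could be cleaned, the snippet below applies the same `read.delim` call to three made-up ping lines (passed through `text =` instead of a file) and strips the `icmp_seq=` and `time=` prefixes. The sample lines and the use of `sub` are assumptions, not the post's code.

```r
# Made-up sample of `ping` output; the real files are not part of this excerpt.
raw <- c(
  "PING www.google.com (203.0.113.10) 56(84) bytes of data.",
  "64 bytes from example-host (203.0.113.10): icmp_seq=1 ttl=53 time=12.3 ms",
  "64 bytes from example-host (203.0.113.10): icmp_seq=2 ttl=53 time=11.8 ms",
  "64 bytes from example-host (203.0.113.10): icmp_seq=3 ttl=53 time=14.1 ms"
)

# Same reading strategy as in the excerpt: skip the header line and
# split on spaces, so fields 6 and 8 hold "icmp_seq=N" and "time=T".
pwm <- read.delim(
  text = raw, nrows = 3, skip = 1, header = FALSE,
  sep = " ", stringsAsFactors = FALSE
)
pwm <- pwm[ , c(6, 8)]
colnames(pwm) <- c("icmp_seq", "time")

# Drop the "key=" prefixes and convert to numbers.
pwm$icmp_seq <- as.integer(sub("icmp_seq=", "", pwm$icmp_seq, fixed = TRUE))
pwm$time <- as.numeric(sub("time=", "", pwm$time, fixed = TRUE))
```

Splitting on single spaces relies on every reply line having the same number of fields, which holds for the sample above but is worth checking on real logs.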

Creating a jobs time-line for a resume in R

Let’s say we define a data.frame with the jobs I’ve had from 2008 to 2017: jobs <- data.frame( employer = c( "GICO", "TES", "UAB", "IFAE", "ISGlobal" ), year_start = as.Date( c( "2008-07-01", "2009-11-01", "2010-09-01", "2011-07-01", "2013-09-01" ) ), year_end = as.Date( c( "2009-10-31", "2010-07-31", "2011-06-30", "2012-09-30", "2017-08-31" ) ), id = 1, stringsAsFactors = FALSE ) The content of the data.frame is easy to understand: The employer column shows the name of the company/institution that employed me.
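The excerpt ends before any plotting code, so the following is just one possible sketch of a time-line drawn from this data.frame. It uses base graphics, which may not be what the post itself uses; the `days` column and the `plot_timeline` function are hypothetical names introduced here.

```r
jobs <- data.frame(
  employer = c("GICO", "TES", "UAB", "IFAE", "ISGlobal"),
  year_start = as.Date(c("2008-07-01", "2009-11-01", "2010-09-01", "2011-07-01", "2013-09-01")),
  year_end = as.Date(c("2009-10-31", "2010-07-31", "2011-06-30", "2012-09-30", "2017-08-31")),
  id = 1,
  stringsAsFactors = FALSE
)

# Length of each job in days (hypothetical helper column).
jobs$days <- as.integer(jobs$year_end - jobs$year_start)

# Hypothetical function: one coloured bar per job on a shared date axis.
plot_timeline <- function(jobs) {
  n <- nrow(jobs)
  plot(
    x = range(c(jobs$year_start, jobs$year_end)), y = c(0.5, 1.5),
    type = "n", yaxt = "n", xlab = "Year", ylab = ""
  )
  rect(jobs$year_start, 0.9, jobs$year_end, 1.1, col = seq_len(n))
  # Label each bar at its midpoint.
  mid <- jobs$year_start + (jobs$year_end - jobs$year_start) / 2
  text(mid, rep(1.2, n), jobs$employer)
}
```

Calling `plot_timeline( jobs )` draws the five jobs as consecutive bars, with the gaps between contracts left blank.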

R and regex: find all occurrences

In R there are many functions that work with a pattern written as a regular expression. Today I needed to deal with one of these functions: str_locate_all (doc) from stringr. My goal was to find "223777_at [Chip: U133B]" in a series of strings like the following one: text <- "11753227_s_at [Chip: PrimeView]; 223777_at [Chip: HT_HG-U133B]; 223777_PM_at [Chip: U133_Plus_PM]; 48336_at [Chip: U95B]; 223777_at [Chip: GeneProfilingArray]; g13477210_3p_at [Chip: U133_X3P]; MmugDNA.4759.1.S1_at [Chip: Rhesus]; 11753227_s_at [Chip: HG-U219]; ADXECADA.
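To illustrate what "find all occurrences" means here, the sketch below locates every occurrence of a probe identifier in a shortened version of the string. It swaps in base R's `gregexpr` for `str_locate_all` so that it runs without extra packages, and it searches only for the identifier `223777_at` as a fixed string; both choices are assumptions made for the example.

```r
# First five entries of the string shown above (the rest is truncated in the excerpt).
text <- paste(
  "11753227_s_at [Chip: PrimeView]",
  "223777_at [Chip: HT_HG-U133B]",
  "223777_PM_at [Chip: U133_Plus_PM]",
  "48336_at [Chip: U95B]",
  "223777_at [Chip: GeneProfilingArray]",
  sep = "; "
)

# gregexpr() returns the start of every match plus a "match.length" attribute;
# fixed = TRUE treats the pattern literally instead of as a regular expression.
hits <- gregexpr("223777_at", text, fixed = TRUE)[[1]]

# Start/end table, similar in spirit to what str_locate_all() reports.
# Note: when nothing matches, gregexpr() returns -1 as the only start.
locations <- data.frame(
  start = as.integer(hits),
  end = as.integer(hits) + attr(hits, "match.length") - 1L
)

# Sanity check: cut the matches back out of the original string.
matched <- substring(text, locations$start, locations$end)
```

Here `223777_PM_at` is not reported, because the literal text `223777_at` does not occur inside it.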

Unload (detach) a loaded R package
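
No excerpt survives for this post, so this is only a minimal sketch of what the title describes, using base R's `detach` with `unload = TRUE`. The `splines` package is just an example, picked because it ships with R and is not attached by default.

```r
# Attach a package, then check that it is on the search path.
library(splines)
was_attached <- "package:splines" %in% search()

# detach() removes it from the search path; unload = TRUE also tries to
# unload its namespace (R warns instead if another package still needs it).
detach("package:splines", unload = TRUE)
still_attached <- "package:splines" %in% search()
```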

Gene-Enrichment in PsyGeNET's Main-Psychiatric-Disorders

PsyGeNET is a database that integrates information on psychiatric disorders and their genes (check its About page for more information). The current version of the database focuses on three main psychiatric disorders: Alcoholism, Depression and Cocaine-Related-Disorders. Currently the author of PsyGeNET, Alba Gutiérrez, and I are developing an R package (PsyGeNET2R) to query the information stored in the database and to perform some analyses using this information. We thought that it could be a good idea to perform an enrichment analysis on the three main psychiatric disorders given a list of genes of interest.

Understanding hypergeometric tests

Hypergeometric tests are useful to perform enrichment analysis. As I see it, the most commonly performed enrichment analysis is the one where people want to obtain a list of enriched GO terms given a list of genes. The hypergeometric test is equivalent to the one-tailed Fisher’s exact test, and it expresses the statistical confidence as a p-value. For example, given a shuffled poker deck with no jokers, we want to see whether a hand of five random cards is diamond-enriched:
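The worked example is cut off in this excerpt, so here is a minimal sketch of the deck scenario with base R. It assumes a 52-card deck with 13 diamonds and a hypothetical hand containing 3 diamonds (that count is made up), and it checks the stated equivalence with the one-tailed Fisher's exact test.

```r
deck_size <- 52   # poker deck without jokers
diamonds <- 13    # diamonds in the deck
hand_size <- 5    # cards drawn
observed <- 3     # hypothetical number of diamonds in the hand

# P(X >= observed): phyper() gives P(X <= q), so ask for the upper tail
# starting just below the observed count.
p_hyper <- phyper(
  observed - 1, diamonds, deck_size - diamonds, hand_size,
  lower.tail = FALSE
)

# Same question as a 2x2 table: suit (diamond / other) by drawn (yes / no).
hand <- matrix(
  c(observed, hand_size - observed,
    diamonds - observed, deck_size - diamonds - (hand_size - observed)),
  nrow = 2,
  dimnames = list(suit = c("diamond", "other"), drawn = c("yes", "no"))
)
p_fisher <- fisher.test(hand, alternative = "greater")$p.value
```

Both `p_hyper` and `p_fisher` hold the probability of drawing at least three diamonds by chance, so a small value would mark the hand as diamond-enriched.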

Christmas Card 2014 - 2015

As in previous years, I continued with the tradition of sending a geek Christmas card to my workmates. This year I wrote it in R, since it is one of the most used languages at CREAL. The result of running the script I sent (just a piece of source code that loaded a file from the cloud) is seen above. The source code can be found at my christmas-card’s github (link) as mc201415.