Boston's Temperature Chart

At the end of January I will be moving to Boston to start my post-doc at Boston Children’s Hospital, so I started looking into the weather and temperature conditions there. I used Weather Underground to download a weather table for each month of 2016 and 2017. The aim is to create a plot with the daily minimum and maximum temperature throughout 2017, plus a heat map indicating the weather condition of each day of the year.

minfi betas and residuals from methylation models

In the HELIX project we decided to use residuals instead of M values for the methylation analyses. So, how do we get the residuals of a basic linear model?

Libraries and Data

First of all we load the libraries we will use in this test:

    library( limma )    # We use lmFit to fit the linear model
    library( minfi )    # Methylation data is saved as a GenomicRatioSet
    library( SmartSVA ) # We want to compute the SVA to correct methylation data
    library( isva )     # Same purpose as SmartSVA
    library( Biobase )  # We will save the residuals in an ExpressionSet

Once the libraries are loaded we proceed to obtain the methylation data:
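As a rough illustration of what "residuals of a basic linear model" means here, the sketch below fits one linear model per probe and keeps what the covariates do not explain. It is a base-R stand-in and not the limma/minfi pipeline from the post: the matrix `mvals`, the covariates `age` and `sex`, and all values are simulated purely for illustration.

```r
set.seed(1)
n_samples <- 20
n_probes  <- 5

# Simulated methylation-like matrix: probes in rows, samples in columns
# (random numbers, only meant to show the mechanics)
mvals <- matrix(
  rnorm(n_probes * n_samples),
  nrow = n_probes,
  dimnames = list(paste0("probe", seq_len(n_probes)),
                  paste0("sample", seq_len(n_samples)))
)

# Hypothetical covariates to adjust for
pheno <- data.frame(
  age = rnorm(n_samples, mean = 40, sd = 10),
  sex = factor(rep(c("F", "M"), each = n_samples / 2))
)
design <- model.matrix(~ age + sex, data = pheno)

# lm.fit accepts a matrix response (samples in rows), so a single call
# fits the same model to every probe
fit <- lm.fit(x = design, y = t(mvals))

# Back to probes x samples, the same shape as the input
res <- t(fit$residuals)
dimnames(res) <- dimnames(mvals)
```

The resulting `res` matrix has the same dimensions as the input and is, by construction, orthogonal to every column of the design matrix, which is what makes it usable as "covariate-adjusted" data in a later step.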

Exploring public NHANES data using Rcupcake

The Rcupcake package contains functions to query different databases through the BD2K RESTful API, an interface that provides access to different data sources and makes data accessibility, analysis reproducibility and scalability easier. The package is installed via devtools using its GitHub URL (hms-dbmi/Rcupcake).

    library( Rcupcake )

The Rcupcake package follows a four-step process to retrieve the data from a database:

1. Start session
2. Select the variables of interest
3. Build the JSON query
4. Run the query to obtain the data

The start.

Comparing 'user' Internet connection from some Catalan research centers

Using the same technique as in the older post “Comparing ping time between connections”, I asked some colleagues to run the following command at their research centers.

    ping www.google.com -c 200 > ping_google.txt

I then loaded the multiple ping files to create a data.frame with the icmp_seq number, the time spent per ping and the institution where the ping was run.

    ping <- lapply( files, function( file ) { dta <- read.

From Barcelona to Boston in R

Today I am traveling to Boston to attend BioC 2017: Where Software and Biology Connect. On this trip to Boston, I stop over in Lisbon to take the transoceanic flight. Let’s draw a map “centered” between Boston and Barcelona using the maps package:

    library( maps )
    xlim <- c( -140, 20 )
    ylim <- c( 25, 50 )
    map( "world", lwd = 0.75, xlim = xlim, ylim = ylim )

Map showing Spain and USA

Comparing ping time between connections

To perform this test I pinged Google 200 times from my PC at ISGlobal (running Linux Mint in a Virtual Machine) and from my laptop (running native Ubuntu). I saved the output in two TXT files with a command like the following one:

    ping www.google.com -c 200 > ping_google_work_wifi.txt

I processed both files in R to create a data.frame.

    pwm <- read.delim( file1, nrows = 200, skip = 1, header = FALSE, sep = " ", stringsAsFactors = FALSE )
    pwm <- pwm[ , c( 6, 8 ) ]
    colnames( pwm ) <- c( "icmp_seq", "time" )
    pwm$icmp_seq <- as.
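The column-based parsing above depends on where the spaces fall in each line. An alternative, sketched here as an assumption rather than the approach used in the post, is to pull `icmp_seq` and `time` out of each line with regular expressions. The helper `parse_ping` is hypothetical, and the sample lines are made up to mimic typical Linux `ping` output (they are not real measurements).

```r
# Made-up lines in the usual Linux ping format; host and address are placeholders
ping_lines <- c(
  "PING www.google.com (192.0.2.1) 56(84) bytes of data.",
  "64 bytes from example-host (192.0.2.1): icmp_seq=1 ttl=115 time=11.2 ms",
  "64 bytes from example-host (192.0.2.1): icmp_seq=2 ttl=115 time=10.8 ms"
)

# Hypothetical helper: keep only reply lines and extract the two fields
parse_ping <- function(lines) {
  replies <- grep("icmp_seq=", lines, value = TRUE)
  data.frame(
    icmp_seq = as.integer(sub(".*icmp_seq=([0-9]+).*", "\\1", replies)),
    time     = as.numeric(sub(".*time=([0-9.]+).*", "\\1", replies)),
    stringsAsFactors = FALSE
  )
}

pwm_sketch <- parse_ping(ping_lines)
```

With a real file, the same helper could be fed `readLines( file1 )`; header and summary lines are skipped automatically because they do not contain `icmp_seq=`.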

Creating a jobs timeline for a resume in R

Let’s say we define a data.frame with the jobs I’ve had from 2008 to 2017:

    jobs <- data.frame(
      employer = c( "GICO", "TES", "UAB", "IFAE", "ISGlobal" ),
      year_start = as.Date( c( "2008-07-01", "2009-11-01", "2010-09-01", "2011-07-01", "2013-09-01" ) ),
      year_end = as.Date( c( "2009-10-31", "2010-07-31", "2011-06-30", "2012-09-30", "2017-08-31" ) ),
      id = 1,
      stringsAsFactors = FALSE
    )

The content of the data.frame is easy to understand: The employer shows the name of the company/institution that employed me.
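One possible way to turn this data.frame into a timeline is sketched below with base graphics: one horizontal segment per job, running from its start date to its end date. The plotting approach and the extra `days` column are assumptions made for illustration; the excerpt does not say which plotting package the post uses.

```r
jobs <- data.frame(
  employer   = c("GICO", "TES", "UAB", "IFAE", "ISGlobal"),
  year_start = as.Date(c("2008-07-01", "2009-11-01", "2010-09-01",
                         "2011-07-01", "2013-09-01")),
  year_end   = as.Date(c("2009-10-31", "2010-07-31", "2011-06-30",
                         "2012-09-30", "2017-08-31")),
  id = 1,
  stringsAsFactors = FALSE
)

# Length of each job in days, counting the end date itself
jobs$days <- as.numeric(jobs$year_end - jobs$year_start) + 1

# Empty canvas spanning all dates, one row per employer
rows <- seq_len(nrow(jobs))
plot(range(c(jobs$year_start, jobs$year_end)), c(1, nrow(jobs)),
     type = "n", yaxt = "n", xlab = "", ylab = "")
axis(2, at = rows, labels = jobs$employer, las = 1)

# One thick segment per job (dates passed as their numeric day counts)
segments(x0 = as.numeric(jobs$year_start), y0 = rows,
         x1 = as.numeric(jobs$year_end),   y1 = rows, lwd = 4)
```

Placing every job on its own row keeps the labels readable; drawing them all on a single row (for example at the constant `id`) would give a more compact, resume-style bar.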

R and regex: find all occurrences

In R there are many functions that work with a pattern written as a regular expression. Today I needed to deal with one of these functions: str_locate_all (doc) from stringr. My goal was to find "223777_at [Chip: U133B]" in a series of strings like the following one:

    text <- "11753227_s_at [Chip: PrimeView]; 223777_at [Chip: HT_HG-U133B]; 223777_PM_at [Chip: U133_Plus_PM]; 48336_at [Chip: U95B]; 223777_at [Chip: GeneProfilingArray]; g13477210_3p_at [Chip: U133_X3P]; MmugDNA.4759.1.S1_at [Chip: Rhesus]; 11753227_s_at [Chip: HG-U219]; ADXECADA.
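One thing to watch with a target like this is that `[` and `]` are regex metacharacters, so the pattern must either be escaped or matched as a fixed string. The sketch below shows both options with base R's `gregexpr`, used here as a stand-in for `str_locate_all` so that it runs without extra packages. The string is a shortened version of the one above, and the search target (the probe name followed by `[Chip: `) is chosen only for illustration.

```r
# Shortened version of the example string (first five entries only)
text <- paste(
  "11753227_s_at [Chip: PrimeView]",
  "223777_at [Chip: HT_HG-U133B]",
  "223777_PM_at [Chip: U133_Plus_PM]",
  "48336_at [Chip: U95B]",
  "223777_at [Chip: GeneProfilingArray]",
  sep = "; "
)

probe <- "223777_at [Chip: "

# Option 1: treat the pattern as a literal string, no escaping needed
fixed_hits <- gregexpr(probe, text, fixed = TRUE)[[1]]

# Option 2: keep it a regular expression and escape the opening bracket
regex_hits <- gregexpr("223777_at \\[Chip: ", text)[[1]]

# Start/end positions of every occurrence, similar in shape to the
# matrix that str_locate_all returns
starts <- as.integer(fixed_hits)
ends   <- starts + attr(fixed_hits, "match.length") - 1L
hits   <- cbind(start = starts, end = ends)
```

Both options locate the same occurrences; `substring( text, starts, ends )` can be used to check that each reported span really contains the probe label.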

Unload (detach) a loaded R package

Gene-Enrichment in PsyGeNET's Main-Psychiatric-Disorders

PsyGeNET is a database that integrates information on psychiatric disorders and their genes (check its About page for more information). The current version of the database focuses on three main psychiatric disorders: Alcoholism, Depression and Cocaine-Related-Disorders. Currently the author of PsyGeNET, Alba Gutiérrez, and I are developing an R package (PsyGeNET2R) to query the information stored in the database and to perform some analyses using this information. We thought that it would be a good idea to perform an enrichment analysis on the three main psychiatric disorders given a list of genes of interest.
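A common way to frame such an enrichment analysis is a one-sided hypergeometric test: given a gene universe, how surprising is the overlap between the list of interest and the genes annotated to one disorder? The sketch below is a generic illustration under that assumption; the excerpt does not state which test PsyGeNET2R uses, and every count here is a hypothetical placeholder rather than a figure taken from PsyGeNET.

```r
# Hypothetical counts, for illustration only
universe_size <- 20000  # genes that could have been picked
disease_genes <- 500    # genes annotated to one disorder
list_size     <- 100    # genes in the list of interest
overlap       <- 10     # genes in both the list and the disorder set

# P(overlap >= observed) when drawing list_size genes at random
p_hyper <- phyper(overlap - 1, disease_genes,
                  universe_size - disease_genes,
                  list_size, lower.tail = FALSE)

# The same test written as a one-sided Fisher's exact test on a 2x2 table
tab <- matrix(
  c(overlap,                 list_size - overlap,
    disease_genes - overlap, universe_size - disease_genes - list_size + overlap),
  nrow = 2
)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value
```

Repeating the test once per disorder gives three p-values, which would normally be adjusted for multiple testing (for example with `p.adjust`) before being interpreted.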