For historical reasons, data types are known in R as "modes". A large number of functions let you determine the type (aka mode) of a variable at runtime. For example, the is.numeric function tests whether a variable is numeric:
> a <- 1
> is.numeric(a)
[1] TRUE
> is.factor(a)
[1] FALSE
Several modes are available. Logical values, for example, have mode logical:
> a = TRUE
> typeof(a)
[1] "logical"
> b = FALSE
> typeof(b)
[1] "logical"
The standard comparison and logical operators can be used:
< | less than |
> | greater than |
<= | less than or equal |
>= | greater than or equal |
== | equal to |
!= | not equal to |
| | element-wise or |
|| | or (single values only) |
! | not |
& | element-wise and |
&& | and (single values only) |
xor(a,b) | element-wise exclusive or |
Character strings are stored as vectors of mode character (rather than mode numeric); a single string is a one-element character vector.
A string is specified by using quotes. As in Perl, both single and double quotes work:
> a <- "hello"
> a
[1] "hello"
> b <- c("hello","there")
> b
[1] "hello" "there"
> b[1]
[1] "hello"
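The name of the type given to strings is character. The character function creates a vector of empty strings of the given length:
> typeof(a)
[1] "character"
> a = character(20)
> a
[1] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
The length function returns the number of elements in a vector, not the number of characters in a string (use nchar for the latter):
> length(a)
[1] 20
> mode(a)
[1] "character"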
R has various string-manipulation functions. Many deal with putting strings together or taking them apart, such as the two shown here:
> u <- paste("abc","de","f") # concatenate the strings
> u
[1] "abc de f"
> v <- strsplit(u," ") # split the string using blanks as separator
> v
[[1]]
[1] "abc" "de" "f"
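The difference between the length of a character vector and the length of a string is easy to trip over, so here is a small sketch. The variable names are made up for illustration; nchar, paste, paste0, strsplit and substr are all base R functions.

```r
s <- "hello"
n_elements <- length(s)   # 1: the vector holds one string
n_chars <- nchar(s)       # 5: the string holds five characters

words <- c("abc", "de", "f")
joined <- paste(words, collapse = "-")    # "abc-de-f"
labels <- paste0("x", 1:3)                # "x1" "x2" "x3"

# strsplit() returns a list with one element per input string
parts <- strsplit(joined, "-", fixed = TRUE)[[1]]
middle <- substr(s, 2, 4)                 # "ell"
```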
One interesting feature of R is that it does not have scalar variables. All scalar values in R are actually vectors of length one. That's a big difference from Perl, C++ and C, so strictly speaking a term other than "scalar" should be used.
To store a number in a variable you can use an assignment statement, which uses '<-' instead of '=', although, strangely enough, '=' is allowed too.
> a <- 3
> a
[1] 3
The “<-” (assignment operator) tells R to take the value to the right of the symbol and store it in a variable whose name is given on the left. The operator is itself a function, and the related function assign does the same job with the variable name given as a string. That gives you the ability to create variable names from strings (and, by extension, from input files). For example:
assign("a",3)
The spaces around the assignment operator aren’t compulsory, but they help readability, especially with <-, so that we can easily distinguish assignment from less than:
x <- 3 # assignment
x < -3 # "less than" comparison with a negative number
x<-3 # this is an assignment, although it looks like a comparison with a negative number
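As a minimal sketch of these points, the snippet below calls `<-` with ordinary function syntax and uses assign and get with a name built from a string. The variable names p, q and col_1 are made up for illustration.

```r
# `<-` is a function, so it can be called with function syntax
`<-`(p, 3)                      # same effect as: p <- 3

# assign() takes the variable name as a string
assign("q", p + 1)              # same effect as: q <- p + 1

# Build a name at runtime (hypothetical name), then create and read the variable
var_name <- paste0("col_", 1)   # "col_1"
assign(var_name, 10)
col_value <- get(var_name)      # 10
```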
You can also use the traditional "=" symbol. When you make an assignment R does not print out any information. If you want to see what value a variable has, just type the name of the variable on a line and press the enter key:
> a
[1] 3
R supports all the basic mathematical operations and the standard math functions. For example:
> b <- max(a,0)
R provides good access to the symbol table of the interpreter. For example, if you want a list of the variables that you have defined in a particular session, you can list them all using the ls function:
> ls()
[1] "a" "b"
From the help page for ls:
‘ls’ and ‘objects’ return a vector of character strings giving the
names of the objects in the specified environment. When invoked
with no argument at the top level prompt, ‘ls’ shows what data
sets and functions a user has defined. When invoked with no
argument inside a function, ‘ls’ returns the names of the
functions local variables. This is useful in conjunction with
‘browser’.
Note that to list ALL variables you would need to use
ls(all.names = TRUE)
otherwise variables whose names begin with a dot won't show up in the listing.
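A small sketch of the all.names behaviour. It uses a scratch environment created with new.env() so that the global workspace is left alone; the object names visible and .hidden are made up for illustration.

```r
env <- new.env()                         # scratch environment for the demonstration
assign("visible", 1, envir = env)
assign(".hidden", 2, envir = env)

default_names <- ls(env)                 # only "visible"
all_names <- ls(env, all.names = TRUE)   # includes ".hidden" as well
```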
Vectors in R are one-dimensional arrays with some twists. The vector data type is really the heart of R.
The elements of a vector must all have the same mode, or data type. You can have a vector consisting of three character strings (of mode character) or three integer elements (of mode integer), but not a vector with one integer element and two character string elements. The simplest way to create a vector is to use the function vector, which produces a vector of the given length and mode.
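A brief sketch of the vector function and of what happens when c() is given elements of different modes (they are coerced to a common one). The variable names are illustrative only.

```r
v_num <- vector("numeric", 3)      # 0 0 0
v_chr <- vector("character", 2)    # "" ""
v_lgl <- vector("logical", 2)      # FALSE FALSE

# c() cannot mix modes: the integer is coerced to a character string
mixed <- c(1L, "a", "b")
mixed_type <- typeof(mixed)        # "character"
```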
You can also create and initialize a vector using the c command:
> a <- c(1,2,3,4,5)
> a
[1] 1 2 3 4 5
> a+1
[1] 2 3 4 5 6
> mean(a)
[1] 3
> var(a)
[1] 2.5
You can get access to particular entries in the vector in the following manner:
> a <- c(1,2,3,4,5)
> a[1]
[1] 1
> a[2]
[1] 2
> a[0]
numeric(0)
> a[5]
[1] 5
> a[6]
[1] NA
Note that index zero selects nothing: a[0] returns an empty vector of the same type, here numeric(0). The first entry in the vector is at index 1, and if you try to get an entry past the last one you get “NA.”
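Beyond single positive indices, base R also accepts negative, vector and logical indices, and the same forms work on the left-hand side of an assignment. A short sketch with illustrative variable names:

```r
a <- c(1, 2, 3, 4, 5)

empty <- a[0]            # numeric(0): index zero selects nothing
past_end <- a[6]         # NA: beyond the last element
without_first <- a[-1]   # 2 3 4 5: negative index drops an element
ends <- a[c(1, 5)]       # 1 5: index with a vector of positions
large <- a[a > 3]        # 4 5: index with a logical vector

a[a > 3] <- 0            # replacement through the same indexing
```

After the last line a holds 1 2 3 0 0.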
Examples of the sort of operations you can do on vectors are given in the next chapter.
To initialize a vector of numbers the numeric function can be used. For example, to create a vector of 10 numbers, initialized to zero, use the following command:
> a <- numeric(10)
> a
[1] 0 0 0 0 0 0 0 0 0 0
If you wish to determine the data type used for a variable, use the typeof function:
> typeof(a)
[1] "double"
Function as.vector, a generic, attempts to coerce its argument into a vector of mode mode (the default is to coerce to whichever vector mode is most convenient): if the result is atomic all attributes are removed.
Function is.vector returns TRUE if x is a vector of the specified mode having no attributes other than names. It returns FALSE otherwise.
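The effect of attributes on is.vector and as.vector can be seen in a short sketch. The attribute name "units" is a made-up example; any attribute other than names has the same effect.

```r
x <- c(a = 1, b = 2)
named_ok <- is.vector(x)                 # TRUE: names are allowed

attr(x, "units") <- "cm"                 # hypothetical extra attribute
with_attr <- is.vector(x)                # FALSE: an attribute other than names

y <- as.vector(x)                        # atomic result, so all attributes are removed
y_attrs <- attributes(y)                 # NULL

is_num <- is.vector(y, mode = "numeric")     # TRUE
is_chr <- is.vector(y, mode = "character")   # FALSE
as_chr <- as.vector(y, mode = "character")   # "1" "2"
```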
For vectors, [ ] is called the extract operator and [ ] <- the replacement operator, rather than assignment; together they support some interesting forms of indexing.
Factors are something like members of a set or an "enumerated type". In other words, a factor is a set of names (levels), each of which is uniquely assigned an integer value. You can index using factors, in which case the integer value of the factor is used.
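A minimal sketch of the name-to-number mapping and of indexing with a factor. The data and the lookup vector are invented for illustration.

```r
f <- factor(c("b", "a", "b"))

lv <- levels(f)          # "a" "b": the set of names, sorted
codes <- as.integer(f)   # 2 1 2: the integer code of each element

# Indexing with a factor uses the integer codes, not the printed names
lookup <- c("first", "second")
picked <- lookup[f]      # "second" "first" "second"
```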
Another way that information is stored is in data frames. A data frame takes many vectors of the same length, possibly of different types, and stores them in one variable. For example, a data frame may contain many columns, and each column might be a vector of factors, strings, or numbers.
There are different ways to create and manipulate data frames. Most are beyond the scope of this introduction. They are only mentioned here to offer a more complete description. Please see the first chapter for more information on data frames.
One example of how to create a data frame is given below:
> a <- c(1,2,3,4)
> b <- c(2,4,6,8)
> levels <- factor(c("A","B","A","B"))
> bubba <- data.frame(first=a, second=b, f=levels)
> bubba
  first second f
1     1      2 A
2     2      4 B
3     3      6 A
4     4      8 B
> summary(bubba)
     first          second    f
 Min.   :1.00   Min.   :2.0   A:2
 1st Qu.:1.75   1st Qu.:3.5   B:2
 Median :2.50   Median :5.0
 Mean   :2.50   Mean   :5.0
 3rd Qu.:3.25   3rd Qu.:6.5
 Max.   :4.00   Max.   :8.0
> bubba$first
[1] 1 2 3 4
> bubba$second
[1] 2 4 6 8
> bubba$f
[1] A B A B
Levels: A B
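Note that there is a difference between operators that act element by element on whole vectors (| and &) and operators that act on single values (|| and &&). In R 4.3.0 and later, || and && signal an error when given a vector with more than one element; older versions silently used only the first elements:
> a = c(TRUE,FALSE)
> b = c(FALSE,FALSE)
> a|b
[1] TRUE FALSE
> a[1]||b[1]
[1] TRUE
> xor(a,b)
[1] TRUE FALSE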
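A short sketch contrasting the element-wise operators with the single-value ones. It also shows any and all, which reduce a logical vector to one value, and the short-circuit behaviour of &&. Variable names are illustrative.

```r
a <- c(TRUE, FALSE)
b <- c(FALSE, FALSE)

either <- a | b     # TRUE FALSE: one result per element
both <- a & b       # FALSE FALSE

any_a <- any(a)     # TRUE: reduce a whole vector to a single value
all_a <- all(a)     # FALSE

# && stops as soon as the answer is known, so stop() is never evaluated here
safe <- FALSE && stop("never reached")
```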
Experienced programmers typically find several aspects of the R language unusual. Here are some features of the language you should be aware of:
Variables created in a function are local to that function. However, unlike C, C++ or many other languages, brackets do not determine the scope of variables. The <- operator assigns in the current environment. When you call a function, R creates a new environment for that call. Its parent is the environment in which the function was created, so you can read variables from there as well, but anything new you create will not get written to the global environment. The <<- operator can modify a variable that already exists in an enclosing or global environment, or create a variable in the global environment, even if you're inside a function. However, it isn't quite as straightforward as that. It skips the current environment and looks in the parent environment (the one in which the function was created) for a variable with the name of interest. If it doesn't find it there, it continues upward through the enclosing environments to the global environment. If the variable is found along the way it is reassigned there; if it isn't found anywhere, it is created in the global environment.
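The two assignment operators can be compared in a small sketch. The function names set_local and make_counter are hypothetical; the counter relies on `<<-` finding count in the environment where the inner function was created.

```r
x <- 1
set_local <- function() {
  x <- 2                  # creates a new local x; the outer x is untouched
  x
}
local_result <- set_local()   # 2
outer_x <- x                  # still 1

make_counter <- function() {
  count <- 0              # lives in the environment of this call
  function() {
    count <<- count + 1   # found in the enclosing environment and updated there
    count
  }
}
counter <- make_counter()
first <- counter()        # 1
second <- counter()       # 2
```

Because count is found in the enclosing environment, the search never reaches the global environment and no global variable is created.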
The period (.) has no special significance in object names. But the dollar sign ($) has a somewhat analogous meaning, identifying the parts of an object. For example, A$x refers to variable x in data frame A.
R doesn’t provide multiline or block comments. You must start each line of a multiline comment with #. For debugging purposes, you can also surround code that you want the interpreter to ignore with the statement if(FALSE){...}. Changing the FALSE to TRUE allows the code to be executed.
Assigning a value to a nonexistent element of a vector, matrix, array, or list will expand that structure to accommodate the new value. For example, consider the following:
> x <- c(8, 6, 4)
> x[7] <- 10
> x
[1] 8 6 4 NA NA NA 10
The vector x has expanded from three to seven elements through the assignment.
x <- x[1:3] would shrink it back to three elements again.
R doesn’t have scalar values. Scalars are represented as one-element vectors.
Indices in R start at 1, not at 0. In the vector earlier, x[1] is 8.
Variables can’t be declared. They come into existence on first assignment.
R How to convert string to variable name - Stack Overflow
How to convert string to variable name?
I am using R to parse a list of strings in the form:
original_string<-"variable_name=variable_value"
First, I extract the variable name and value from the original string and convert the value to numeric class.
parameter_value<-as.numeric("variable_value")
parameter_name<-"variable_name"
Then, I would like to assign the value to a variable with the same name as the parameter_name string.
variable_name<-parameter_value
What is/are the function(s) for doing this?
Asked May 17 '11 at 17:22 by KnowledgeBone
4 Answers
Accepted answer: assign is what you are looking for.
assign("x", 5)
x
[1] 5
but buyer beware.
See R FAQ 7.21 http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-turn-a-string-into-a-variable_003f
Answered May 17 '11 at 17:27 by Greg
Comment: +1 for FAQ 7.21 reference – Ben Bolker May 17 '11 at 18:01
You can use do.call:
do.call("<-",list(parameter_name, parameter_value))
Answered May 17 '11 at 21:03 by Wojciech Sobala
Comment: +1 for thinking. People (me included) usually forget that <- is a function itself. – Rob Oct 25 '12 at 18:32
Use strsplit to parse your input and, as Greg mentioned, assign to assign the variables.
original_string <- c("x=123", "y=456")
pairs <- strsplit(original_string, "=")
lapply(pairs, function(x) assign(x[1], as.numeric(x[2]), envir = globalenv()))
ls()
Answered May 17 '11 at 17:41 by Richie Cotton
Use x=as.name("string"); you can then use x to refer to the variable with name string. I dunno if it answers your question correctly.
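For comparison, here is a sketch of an alternative that is not taken from the answers above: it keeps the parsed values in a named list instead of creating global variables, and uses list2env only if real variables are needed in a dedicated environment. The names values and env are illustrative.

```r
original_string <- c("x=123", "y=456")
pairs <- strsplit(original_string, "=", fixed = TRUE)

# Collect the values in a named list rather than in separate global variables
values <- lapply(pairs, function(p) as.numeric(p[2]))
names(values) <- vapply(pairs, function(p) p[1], character(1))
x_value <- values$x                     # 123

# If variables are needed, create them in their own environment
env <- list2env(values, envir = new.env())
y_value <- get("y", envir = env)        # 456
```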
The R type system
R is a weird beast. Through its ancestor the S language, it claims a proud heritage reaching back to Bell Labs in the 1970s when S was created as an interactive wrapper around a set of statistical and numerical subroutines. As a programming language, R takes ideas from Unix shell scripting, functional languages (Lisp and ML), and also a little from C. Programmers will usually have at least some background in these languages, but one aspect of R that might remain puzzling is its type system.
Because the purpose of R is programming with data, it has some fairly sophisticated tools to represent and manipulate data. First off, the basic unit of data in R is the vector. Even a single integer is represented as a vector of length 1. All elements in an atomic vector are of the same type. The sizes of integers and doubles are implementation dependent. Generic vectors, or lists, hold elements of varying types and can be nested to create compound data structures, as in Lisp-like languages.
Fundamental types
- vectors
- an ordered collection of elements all of one type
- atomic types: logical, numeric (integer or double), complex, character or raw
- special values:
- NA (not available, missing data)
- NaN (not a number)
- +/-Inf (infinity)
- lists
- generic vectors, elements can be of any type, including list
- because they can be nested, lists are sometimes called recursive
- functions
- functions are "first class" data types
- can be assigned, passed as arguments and returned from functions
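The special values and the first-class status of functions listed above can be sketched briefly. The helper apply_twice is a hypothetical function written for this example.

```r
x <- c(1, NA, 3)
missing_flags <- is.na(x)            # FALSE TRUE FALSE
clean_mean <- mean(x, na.rm = TRUE)  # 2: NA is skipped on request

not_a_number <- 0 / 0                # NaN
nan_is_nan <- is.nan(not_a_number)   # TRUE
nan_is_na <- is.na(not_a_number)     # TRUE: NaN also counts as missing
infinite <- 1 / 0                    # Inf

# Functions are values: they can be passed as arguments and returned
apply_twice <- function(f, v) f(f(v))    # hypothetical helper
root_of_root <- apply_twice(sqrt, 16)    # sqrt(sqrt(16)) = 2
```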
# a is a vector of length 1
> a <- 101
> length(a)
[1] 1
# the function c() combines its arguments
# construct a vector of numeric data and access its members
> ages <- c(40, 36, 2, 38, 27, 1)
> ages[2]
[1] 36
> ages[4:6]
[1] 38 27 1
> movie <- list(title='Monty Python\'s The Meaning of Life', year=1983, cast=c('Graham Chapman','John Cleese','Terry Gilliam','Eric Idle','Terry Jones','Michael Palin'))
> movie
$title
[1] "Monty Python's The Meaning of Life"
$year
[1] 1983
$cast
[1] "Graham Chapman" "John Cleese" "Terry Gilliam" "Eric Idle" "Terry Jones" "Michael Palin"
Attributes
R objects can have attributes - arbitrary key/value pairs - attached to them. One use for this is that elements in vectors or lists can be named. R's object system is based on the class attribute. (OK, I really mean the simpler of R's two object systems, but let's avoid that topic.) Attributes are also used to turn one-dimensional vectors into multi-dimensional structures by specifying their dimensions, as we'll see next.
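A minimal sketch of the three uses of attributes mentioned here: names, the class attribute and dimensions. The attribute key "source" and the class name "myclass" are invented for illustration.

```r
v <- c(1, 2, 3)
names(v) <- c("a", "b", "c")          # names are stored as an attribute
b_value <- v[["b"]]                   # 2: look an element up by name

attr(v, "source") <- "example"        # hypothetical arbitrary key/value pair
attr_names <- names(attributes(v))    # includes "names" and "source"

obj <- structure(list(), class = "myclass")   # class is just an attribute
obj_class <- class(obj)               # "myclass"

m <- 1:6
dim(m) <- c(2, 3)                     # a dim attribute turns the vector into a matrix
m_is_matrix <- is.matrix(m)           # TRUE
```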
Matrices and arrays
Matrices and arrays are special types of vectors, distinguished by having a dim (dimensions) attribute. A matrix has two dimensions, so the value of its dim attribute is a vector of length 2 specifying numbers of rows and columns in the matrix. Arrays are n dimensional vectors, sometimes used like an OLAP data cube, with dimension vectors of length n.
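One way to build such a matrix is sketched below, using the price series from the example in this section. The use of rbind is an assumption made for this sketch and may differ from how the original example was built; the three-dimensional array is purely illustrative.

```r
bac <- c(14.08, 7.05, 13.05, 16.21)
hbc <- c(48.67, 29.51, 41.93, 55.82)
jpm <- c(31.53, 28.14, 33.77, 41.37)

# rbind() stacks the vectors as rows and takes the row names from the variable names
m <- rbind(bac, hbc, jpm)
colnames(m) <- c("q1", "q2", "q3", "q4")

m_dim <- dim(m)             # 3 4
cell <- m["hbc", "q3"]      # 41.93

# An array is the same idea with a longer dim vector
arr <- array(1:24, dim = c(2, 3, 4))
last <- arr[2, 3, 4]        # 24
```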
# create some data series
> bac = c(14.08, 7.05, 13.05, 16.21)
> hbc = c(48.67, 29.51, 41.93, 55.82)
> jpm = c(31.53, 28.14, 33.77, 41.37)
# create a matrix whose rows are companies and columns are quarters
# values in the matrix is closing stock price on ) <- c('q1', 'q2', 'q3', 'q4')
> m
       q1    q2    q3    q4
bac 14.08  7.05 13.05 16.21
hbc 48.67 29.51 41.93 55.82
jpm 31.53 28.14 33.77 41.37
# check out the attributes
> attributes(m)
$dim
[1] 3 4
$dimnames
$dimnames[[1]]
[1] "bac" "hbc" "jpm"
$dimnames[[2]]
[1] "q1" "q2" "q3" "q4"
Factors
Statisticians divide data into four types: nominal, ordinal, interval and ratio. Factors are for the first two, depending on whether they are ordered or not. This makes a difference for some of the stats algorithms in R, but from a programmer's point of view, a factor is just an enum. Older versions of R turn character vectors into factors at the slightest provocation (since R 4.0.0, data.frame() and read.table() no longer do so by default). It's sometimes necessary to coerce factors back to character strings, using as.character().
- represent categorical or rank data compactly
- examples: countries, male/female, small/medium/large, etc.
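A sketch of an ordered factor for rank data and of coercing factors back to strings or numbers. The data are invented; note the common pitfall that as.numeric on a factor returns the integer codes rather than the printed values.

```r
sizes <- factor(c("small", "large", "medium", "small"),
                levels = c("small", "medium", "large"),
                ordered = TRUE)

below_large <- sizes < "large"       # TRUE FALSE TRUE TRUE: order follows the levels
as_text <- as.character(sizes)       # back to plain strings

# Numbers stored as a factor: convert through character to get the values
f <- factor(c("10", "20", "10"))
codes <- as.numeric(f)                    # 1 2 1: the internal codes
values <- as.numeric(as.character(f))     # 10 20 10: the actual numbers
```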
Data frames
A data frame is a special list in which all elements are vectors of equal length. It is analogous to a table in a database, except that it's column-oriented rather than row-oriented. Because the vectors are constrained to be of the same length, you can index any cell in a data frame by its row and column.
- a list of vectors of the same length (columns)
- like a table in a database
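Indexing by row and column can be sketched as follows, reusing the ticker data from the example in this section. The variable names on the left-hand side are illustrative.

```r
df <- data.frame(ticker = c("bac", "hbc", "jpm"),
                 market.cap = c(137.37, 185.65, 157.80),
                 yield = c(0.25, 3.00, 0.50))

cell <- df[2, "yield"]             # 3: row 2, column "yield"
cap3 <- df$market.cap[3]           # 157.8: a column is an ordinary vector

# Select rows by a condition and pick one column
high_yield <- as.character(df[df$yield > 0.4, "ticker"])   # "hbc" "jpm"

shape <- c(nrow(df), ncol(df))     # 3 3
```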
# make a simple data frame
> df <- data.frame(ticker=c('bac', 'hbc', 'jpm'), market.cap=c(137.37, 185.65, 157.80), yield=c(0.25,3.00,0.50))
> df
  ticker market.cap yield
1    bac     137.37  0.25
2    hbc     185.65  3.00
3    jpm     157.80  0.50
There's more, of course, but this gives you enough to be dangerous. Note that, because R natively works with vectors, many operations in R are vectorized, meaning they operate on whole vectors at once, rather than on a single scalar value. The key to performance in R is making good use of vectorized operations. Also, being functional, R inherits a full complement of higher-order functions - Map, Reduce, Filter and many forms of apply (lapply, sapply, and tapply). Mixing higher-order functions and vectorized operations can get confusing (and is the source of the proliferation of apply functions). Both these techniques, as well as the organization of the type system, encourage you to work with blocks of data as a unit. This is what John Chambers called high-level prototyping for computations with data.
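To make the contrast concrete, here is a short sketch of a vectorized computation next to the higher-order functions named above. The data are illustrative, and all functions used are part of base R.

```r
prices <- c(14.08, 7.05, 13.05, 16.21)

# Vectorized: arithmetic applies to every element at once, no loop needed
doubled <- prices * 2
returns <- diff(prices) / head(prices, -1)   # quarter-over-quarter change

# Higher-order functions take another function as an argument
squares <- sapply(1:4, function(i) i^2)                  # 1 4 9 16
evens <- Filter(function(i) i %% 2 == 0, 1:10)           # 2 4 6 8 10
total <- Reduce(`+`, 1:5)                                # 15
sums <- Map(function(x, y) x + y, 1:3, 4:6)              # list of 5, 7, 9
group_means <- tapply(c(1, 2, 3, 4), c("a", "b", "a", "b"), mean)   # a = 2, b = 3
```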
More information
- more R stuff in Digithead's lab notebook
- Quick R: Data Types
- An Introduction to R
- The R Language Definition
- An introduction to R (by Martin Morgan of FHCRC).
- R programming for those coming from other languages by John Cook
- A Brief History of S by Richard A. Becker
- Cyclismo's R Tutorial: Basic Data Types
Posted by Christopher Bare at 9:55 PM
John Cook's excellent blog post, R programming for those coming from other languages (www.johndcook.com/R_language_for_programmers.html).
Programmers looking for stylistic guidance may also want to check out Google's R Style Guide (http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html).
Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.
This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contain some broken links as it develops like a living tree...
Disclaimer:
The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the Softpanorama society. We do not warrant the correctness of the information provided or its fitness for any purpose. The site uses AdSense so you need to be aware of Google privacy policy. If you do not want to be tracked by Google please disable Javascript for this site. This site is perfectly usable without Javascript.
Last modified: October, 16, 2019