Inicio de sesión

Ya sé Excel,
pero necesito más

El manual que
estabas buscando


Ya sé Excel,
pero necesito más

El manual avanzado
que estabas buscando
www.necesitomas.com

En línea

En este momento hay 0 usuarios y 4 invitados en línea.

blogs.msdn.com

Distribuir contenido
The team blog for Microsoft Excel and Excel Services.
Actualizado: hace 11 horas 6 mins

UK Excel User Conference

12 Diciembre, 2008 - 09:16

Is Excel an important tool you can't live without?

Would you like to meet with others to learn more about how to optimize your usage of Excel?

Do you like London in the spring time?

If you answered Yes to any of these questions, you might be interested in the upcoming UK Excel User Conference.  It takes places on April 1st and 2nd, 2009.  There will be great speakers from the Excel community, and the topics look to be very informative.  And, it's free!

To learn more, check out the full details here.

Categorías: Excel, Inglés

Understanding Excel Services Load Balancing Options

12 Diciembre, 2008 - 09:06

Today's author, Steve Tullis, a Program Manager on the Excel Services team, talks about the various load balancing schemes available for Excel Services. 

I have received a number of questions in the past few weeks about the load balancer administration options available for the Excel Services custom load balancer.  Here’s the scoop:  the settings account for the “OpenWorkbook” method when first accessing Excel Services.  Once you have an active session with a workbook, additional requests from that user will always be sent to the ECS on which the “OpenWorkbook” request was handled.

So, why do we have three options – Workbook URL, Round Robin, and Local?  The options are meant to provide the admin better control over the resource usage on his/her farm.  An explanation of each option follows:

  • Workbook URL:  Arguably, this provides the most efficient use of your hardware.  Requests are sent from the WFE to an ECS based on a hash of the URL.  This ensures the same ECS will always handle requests for the same workbook – regardless of the user making the request.  Which means a workbook will only ever be retrieved from SharePoint once – until it is pushed out of the cache, or the file changes in SharePoint. 
  • Round robin:  Each OpenWorkbook request goes to the next ECS in the rotation.  The result is that the same wb can be loaded from SharePoint N times, where N = the number of ECS machines in your farm.  Keep in mind two things:  (1) requests against an active session will always return to the same ECS, and (2) if the wb is cached on the ECS when an OpenWorkbook request arrives, it will be loaded from cache provided the file has not changed in SharePoint.  The benefit with this option is that each concurrent request goes to a different server; thus, theoretically, queues on the server (CPU, I/O) will be shorter, thus requests can be handled more efficiently.  The effectiveness of this is dictated by the composition of your workbooks (wbs with very long running calculations tend to invalidate this).
  • Local:  This setting ensures ECS processing happens on the same WFE machine which received the request, which requires that the ECS service must be running on every WFE.  Again, the same wb can be loaded N times, where N = the number of ECS / WFE machines in the farm.  The benefit here is, theoretically, performance.  By running the ECS on the WFE, and reducing the # of server-to-server hops, end user performance should be faster.  The cost is in retrieval from SharePoint & cache size due to the potential of the same wb being open on multiple WFEs.

As you can see, there are pros and cons to each option.  But, with the information provided above, as well as a bit of knowledge and experimentation on your side, you should be able to correctly select the right load balancing option for your installation.

Categorías: Excel, Inglés

Using Excel to Optimize a Network Problem

4 Diciembre, 2008 - 07:59

Today's author, Diego Oppenheimer, a Program Manager on the Excel team, delves into some of the more advanced functionality of Excel to solve a challenging problem.  The file used in this blog post can be found in the attachments at the bottom of this post.

Throughout this next blog post I will be using a tool that comes with Excel called Solver. For more information on solver please visit the Office Online site (http://office.microsoft.com/en-us/excel/HP051983681033.aspx?pid=CH010004571033).

Background:

I am currently planning a vacation drive exploring the state of Washington. Ideally in the short amount of time I have off I would like to do as many hikes as possible. I will be starting from Seattle and finishing at Kettle Falls in eastern WA. Along the way I have identified 4 possible stopping places that I am interested in. From stop to stop I have researched how many hikes I can do along the way between each one of my stops. With a tool like Microsoft Live Maps I could easily plot the shortest distance between Seattle and Kettle Falls, but I am more interested in optimizing my trip to include as many hikes as possible.

For this purpose I have decided to use simple principles of linear programming and create a network model with Excel that would maximize my hiking time on my upcoming vacation. This problem in graph theory is defined as a shortest path problem. Shortest path algorithms are used in web mapping software such as Live Maps.

For a better graphical understanding of my journey I created the following map:

Without ever "moving backwards" I have routed all possible combinations from stop to stop from Seattle, WA to Kettle Falls, WA. I have also numbered my stops from left to right for easier referencing in my workbook.

I built the table below for a quick reference on driving time between the different stopping points. You will notice that I have put a number in brackets (X) representing the number of hikes on that particular part of the journey:

Ex: Seattle to Darrington - Time: 1.4 hrs with 2 hikes that I would be interesting in doing.

Seattle

Longmire

Darrington

Yakima

Wenatchee

Kettle Falls

Seattle

X

4 (5)

1.4 (2)

2.2 (3)

2.5 (5)

5.5 (6)

Longmire

4 (5)

X

4.2 (2)

2.3 (2)

4(2)

6.3 (4)

Darrington

1.4(2)

4.2 (2)

X

3.4 (3)

3.3 (5)

7 (3)

Yakima

2.2 (3)

2.3 (2)

3.4 (3)

X

2(1)

4.4 (3)

Wenatchee

2.5 (5)

4(2)

3.3 (5)

2(1)

X

4.2 (2)

Kettle Falls

5.5 (6)

6.3 (4)

7 (3)

4.4 (3)

4.2 (2)

x

The Setup:

Given that I love hiking I wanted to set up this optimizer so that it could be reused for any amount of stops I wanted. I've only included 6 in this version to maintain simplicity.

First let's create a table that numbers and labels our stop so we create the following table starting at O3:

Stop

Name

1

Seattle

2

Longmire

3

Darrington

4

Yakima

5

Wenatchee

6

Kettle Falls

Great now let's create our possible routes table. This table will represent the paths I drew out on my map and will also contain information on estimate time (in hours) and number of hikes.

We can see that now the table has all the possible routes I can take and also the estimated time between each path as well as the number of hikes on each path. We can now on B3 add our route selector which will be a binary switch (1 for True, 0 for False) that will represent if a path was chosen or not. For now let's initialize all these values to 0.

Solver will change our boolean values in the Route Select column so an easy way to find out how many totals hours and how many totals hikes I can do in my optimal path I can use the function SUMPRODUCT.

SUMPRODUCT will return the sum of the products of the two ranges I give it in this case (Route Select x Number of Hikes).

So let's label C17 as "Number of Hikes" and insert the following formula in C18:

=SUMPRODUCT(Route Selector X Number of Hikes)

Or in this example

Where Table1 is our possible route tables, Route Selector is the range from B3:B12 and Number of Hikes is the range from D4:D12.

Ok so to recap we have our possible routes table and we still have on O3 our stop numbers and stop names.

Next step is to create the constraint system that is necessary for solving this problem. In simple English we need to create constraints so that solver will only find solutions that follow these rules:

  • I always leave Seattle,WA.
  • I always arrive at Kettle Falls,WA.
  • If I arrive at a certain node the next step has to be leaving that node. (With the exception of Seattle and Kettle Falls).

The easiest way to do this is to create once again a binary selector where +1 indicates arriving at a location and -1 indicates leaving a location. Our new network map would be something like this:

And last we will define our flow column by using SUMIFs. SUMIF will add the cells specific by a given set of conditions. In our case the condition is that all our stops "net flow" are 0 except our starting point and ending point.

=SUMIF(ToStopRange, Stop , Route Select Range) - SUMIF (FromStopRange, Stop, RouteSelect Range).

So on Q3 we will start a range titled "Flow". We will define our flow column by using SUMIFs. SUMIF will add the cells specific by a given set of conditions. In our case the condition is that all our stops "net flow" are 0 except our starting point and ending point.

And we copy this formula down from Q3 to Q6.

Since the flow range is determined by the values in the route selector we still need to give solver a reference range to match the flow I indicated in my map diagram. So adjacent to Flow I will create my constraint column and manually input the values -1 for Seattle, 0 for all my middle nodes and +1 for Kettle Falls,WA.

Setting Up Solver.Finally

To enable solver: http://blogs.msdn.com/excel/archive/2006/09/06/743902.aspx

We have 3 tasks to do when setting up solver:

Set Target Cell: Here we will select the cell right next to number of hikes since we want to maximize this number we select "Max".

By Changing Variable Cells:

The cells to change are our Route Selector Range.

Constraints:

Here we will specify the constraints that we discussed before:

  1. The Flow column must be equal to the constraints column. Select Flow Range = Constraint Range
  2. The Route Select Range has to be >= 0 (we only want positive number).
  3. Route Selector must contain binary values.

And we are finally done with setting up our solver. No we just click "Solve" and done. we now have an optimized hiking travel plan.

If you copied all the number I used you should be getting a maximum of 14 hikes and your ideal travel path is:

  • Seattle to Longmire
  • Longmire to Darrington
  • Darrington to Wenatchee
  • Wenatchee to Kettle Falls

Conclusion:

So now we have built a path optimizer that an easily be expanded (add more nodes) or changed around to optimize other variables like Estimated travel time. Even though this example is fairly simple (a map and some simple arithmetic would have provided faster results) the value of these methods comes into play when analyzing large amount of nodes. Similar principles to those discussed in this post are used for optimizing all sorts of manufacturing, transportation and distribution systems.

Categorías: Excel, Inglés

Using the Open XML SDK to Work With Excel Files

25 Noviembre, 2008 - 18:58

Over on Brian Jones' blog there are two recent articles that may be of interest to our readers here.  They deal with using the Open XML SDK to programmatically interact with Excel's XML-based file format.  One deals with "document assembly", which is essentially building a file mostly from scratch based on other data that is fed from some other system (e.g. a database).  The other deals with using the SDK to read data from an Excel file.  If these scenarios interest you, have a look see:

Document Assembly Solution for SpreadsheetML

Reading Data from SpreadsheetML

 

Categorías: Excel, Inglés

Using PivotTables to Analyze Data From a SharePoint List

6 Noviembre, 2008 - 20:40

Today's author, Diego Oppenheimer, a Program Manager on the Excel team, talks about connecting PivotTables to data stored in SharePoint Lists.

Many of us on the Excel team have been approached by customers asking us how to create a connection to a SharePoint list. SharePoint lists can be exported easily by creating a Microsoft Office Excel Web Query. The connection can then be tied directly to lists or PivotTables in the workbook like any other Data Connection.

I have a SharePoint List that we use to upload and keep track of all project management reports. The list contains information like the project manager that uploaded it, the date, and the name of the reports as well as other automated fields created by SharePoint to make my life easier. Using the filters provided by SharePoint gives me a great way navigate my list quickly and easily but I am more interested in creating a macro view of the team reports so an easy solution would be to connect my SharePoint List to a PivotTable. The fact that my SharePoint List has more than 250 rows makes the use of a PivotTable even more appealing.

Creating the Connection and the PivotTable:

Once we have located the SharePoint list we want to export we go to "Export to Spreadsheet" in the "Actions" menu button.

When the message prompting us if we would like to open or save the Microsoft Office Excel Web Query click "Open" (or you can save for use later). If you haven't started Excel yet this should start it up.

Click "Enable" to unblock the data connection.

If you had Excel already open you will get the the "Import Data" menu. In the case that you did not have Excel open already opening the connection will create a a resfreshable query table in the workbook.

From this menu we can choose to directly create a "PivotTable Report" a report and a "PivotChart" or just a query "Table". Today I am interested in creating a PivotTable so I select "PivotTable Report" and hit "Ok".

Now I can see all my uploaded reports in the PivotTable. To get a better view of what is going on we add the Project Manager field to Rows, Date to the Report Filter (so we can filter by date) and the Name field (the name of each one of our uploaded reports on the SharePoint List) to Values. By adding Name to Values we are essentially creating a count of how many name items we have in the data source.

We replace "Count of Name" with "# Reports" by editing in the formula tab like any other cell.

Now I can easily view the numbers of reports per project manager and filter by date as well as viewing the "Grand Total" number of reports. This is great but the team actually splits up in sub teams by area of expertise so I am going to group them to make this clearer.

My teams are:

We select all the members of the each team in the PivotTable and right click. Select "Group". Now I can rename the Group label to better represent the teams. I name my groups "Finance". "Technology" & "Operations".

Combining the use of our Date filter and the groups we created we can easily see a macro view of all the project management reports.

As soon as new reports are populated into my SharePoint List I will be able to update the PivotTable (right click on the PivotTable then click "Refresh").

Where does the connection live?

Like all other data connections in our work book we can access it by selecting the "Existing Connections" button under the "Data" tab.

We can see that our SharePoint List connection is shown under "Connections in this Workbook" and can now be used to create new PivotTables.

Multiple Microsoft Office Excel Web Queries can be used to monitor more than one SharePoint list at a time as well as to easily analyze the data in them using PivotTables and PivotCharts. The same as with all Office Data Connections you can now use this connection with multiple workbooks and update any connection changes in a single place.

Categorías: Excel, Inglés

Excel Web Application Announced at PDC

29 Octubre, 2008 - 07:42

During today’s Professional Development Conference keynote, Ray Ozzie made an exciting announcement about a new way to view, edit, and collaborate with Excel.  If you weren’t able to watch the keynote, feel free to check out the following links: PDC Keynote Video, Official MS Press Release, and Channel 9 Video.

In case you’re in a hurry, here’s the bottom line:

  • Excel is providing a web application which allows you to view, edit, and do real-time collaboration on spreadsheets in your choice of web browser – Internet Explorer, FireFox, or Safari!
  • The Excel web application will be available in two channels: as a consumer service offered via Office Live and as a business offering via either hosted subscription or volume licensing.
  • Any spreadsheets you author or edit online will be compatible with the Excel desktop client.

I know, I know: “Thanks for the teaser; where’s the app?”

While we will have a limited technical preview later this year, we won’t have a broadly public release for awhile.  As soon as I can share a date, and, more importantly, a way to sign up for the public release, I will post it here.

Categorías: Excel, Inglés

Creating a Master-Detail View in Excel

24 Octubre, 2008 - 05:35

Today's author, Dan Battagin, a Program Manager on the Excel team, talks about joining two tables in Excel (a.k.a. returning multiple rows for VLOOKUP).

Today, we'll take a look at the VLOOKUP function, and work out a way to get around its major drawback - it returns only a single value that matches the lookup.

If you're like me, there are times when you have a big table of data which is pretty well normalized, and you want to pull all of the rows out of that table that match a certain criteria. Well, it's easy to get the first match in that table by using VLOOKUP (in fact, it's easier than ever with the new structured references in Excel 2007):

Figure 1: VLOOKUP is an easy way to return a value out of the first row that contains a customer name.

As you can see above, with structured references in Excel 2007, not only is VLOOKUP easy to use, but it's actually pretty readable - we're matching the value of A9 in the first column of Table1 and returning the value from the 3rd column ($25.00). That's really nice, and very useful (especially if you use VLOOKUP as part of a calculated column in a table) but it does have the drawback that it doesn't ever allow us to return any of the values for the second "Dan" in the list. And, if you're like me, this can be pretty frustrating. I'm constantly working with lists of Excel bugs (yep, we've got some bugs), materials lists for woodworking projects that I'm currently involved in, price lists for whatever current home remodel I'm working on, etc. where I really need to see more than just that first row.

So today, let's look at how we can actually make this work, and retrieve all of the values from a table that match a certain criteria.

Before we get started on the solution, let's lay out the data that we're going to use. Since it's always a pain to come up with data sets for work examples like this, I'm going to use an old standby - the Northwind Sample Database that ships with pretty much every version of Access. Of course, this being Excel, I'm going to copy the portion of the data that I need into worksheets. (Yeah, yeah, I'm a database guy too, and I know we could do the joins in SQL, but that's not always the case, so bear with me.) OK, so here's the data I'm going to copy onto sheets in Excel:

Figure 2: The "Customer" table, inserted on a sheet called Customers, and named "tblCustomers" for easy referencing in the future. Figure 3: The "Orders" table, inserted on a sheet called Orders, and named "tblOrders" for easy referencing in the future. Figure 4: The "OrderDetails" table, inserted on a sheet called Details, and named "tblDetails" for easy referencing in the future.

So you don't have to do this yourself, I've also made a copy of the spreadsheet available so you can just follow along (see the attachments at the bottom of this post). Of course, if you want to walk through this yourself, go right ahead, that's a pretty cool compliment.

So now that we've got all of our source data, I want to put together the actual Master-Detail form. Nothing too fancy, but basically I want to select the Order Number that I want to look at, and see some details about that order (who ordered it, order status, etc.) as well as all of the items that are part of that order (item name, price, quantity, etc.). Pretty standard stuff. Here's our target end result:

Figure 5: The master-details form where I can pick an order number and see all of the details.

In the available workbook, you'll find this form put together already, but if you're doing it yourself, here are the steps you'll want to take (note that these steps skip the formatting of this form, since not everyone likes green the way I do):

  1. Enter the following static strings in the sheet:
    1. B1: Order Number
    2. C3: Order Information
    3. C4: Customer
    4. F4: Order Date
    5. F5: Status
    6. F6: Salesperson
    7. F7: Ship Date
    8. C10: Order Details
    9. C11: Product
    10. D11: Quantity
    11. E11: Unit Price
    12. F11: Discount
    13. G11: Total Price
    14. H11: Status ID
  2. Create a Named Range that includes all of your Order IDs, which we'll use to create the Order ID drop down using Data Validation. To do this, click on the Formulas tab | Define Name and enter:
    1. Name: OrderIds
    2. Refers to: =tblOrders[ID]

      Tip: Data Validation cannot refer to ranges on a different sheet than the one that contains the validation, but by using a named range, you can get values from another sheet (and in this case, get a dynamic list of values using the structured reference to get an entire table column, even if new values are added to it in the future)
  3. Select D1 and name it rngOrderId (we'll use this later).
  4. With D1 still selected, click on the Data tab | Data Validation and create a new List type validation with Source: "=OrderIds". Boom, now you have your drop down.

    OK, now we're going to create some "normal" VLOOKUPS as we discussed above already, with a bit of a twist to make them a bit more robust: we're going to use the MATCH function instead of a hard-coded column number for the value we want to retrieve from the source table.
  5. Select C5 and enter the following function:
    =VLOOKUP(rngOrderId, tblOrders, MATCH(C4, tblOrders[#Headers], 0), FALSE)

    Here we've said find the Order ID that I've selected in the Orders table, and return the value from the column that MATCHes Customer.
  6. Just like C5, we're going to setup the rest of the normal VLOOKUP functions to return metadata about the order - this is all the "Master" data in this Master-Detail form:
    1. C6: =VLOOKUP($C$5,tblCustomers,MATCH("Address",tblCustomers[#Headers],0), FALSE)
    2. C7: =VLOOKUP($C$5, tblCustomers, MATCH("City",tblCustomers[#Headers],0), FALSE) & ", " & VLOOKUP($C$5, tblCustomers, MATCH("State",tblCustomers[#Headers],0), FALSE) & " " & VLOOKUP($C$5, tblCustomers, MATCH("Zip",tblCustomers[#Headers],0), FALSE)
    3. H4: =VLOOKUP(rngOrderId, tblOrders, MATCH(F4, tblOrders[#Headers], 0), FALSE)
    4. H5: =VLOOKUP(rngOrderId, tblOrders, MATCH(F5, tblOrders[#Headers], 0), FALSE)
    5. H6: =VLOOKUP(rngOrderId, tblOrders, MATCH(F6, tblOrders[#Headers], 0), FALSE)
    6. H7: =VLOOKUP(rngOrderId, tblOrders, MATCH(F7, tblOrders[#Headers], 0), FALSE)

Figure 6: The completed Master section of the Master-Details form.

Right, so that's not too bad, and get's us something that's pretty robust, even if we add additional data to our source data tables - just as long as we keep the headings in our form matching the column headings in the tables. OK, so now let's move on to the "Details" part of the form - and this is where we'll get to find a solution to the VLOOKUP limitation of only returning a single value.

Before we build up the actual solution, let's talk about some of the elements of the formula we're going to create:

  1. We're not actually going to use VLOOKUP! Since we need to return multiple items, what we really need is a way to return an array (list) of values, and VLOOKUP just doesn't allow for that. Instead, we'll use the INDEX function to return the value at a specific row and column intersection.
  2. We need a way to designate which item in the array we want to show in the cell, and we'll use a combination of the SMALL and ROW functions to accomplish that.
  3. We want this to be robust in the same way that we made our VLOOKUP function robust - that is, we want to be sure that if we add additional columns to our source data, that these functions don't break. Just like with the VLOOKUP solution, we'll use MATCH to ensure this.

    Tip: A nice side effect of what we're going to do here is that this formula is totally fillable within the "Details" portion of the form, which makes it pretty easy to work with/edit.
  4. Lastly, we want to have a bit of error handling - specifically, we'll use IFERROR (a new function in Excel 2007) to ensure that we just show a "blank" cell if there is an error in the calculation.

OK, so without further ado, let's see that function, as it exists in cell C12 (note that when you enter this function, you enter it without the curly braces, and you press Ctrl+Shift+Enter to commit the cell, which adds the curly braces, making it an array function):

Figure 7: Array function used to return the Nth item that matches a specific value.

That looks pretty complex, so let's take a look at what's actually going on here.  I've pretty printed sections of the formula for easier reading, from the inside out, since that's how Excel will ultimately calculate it:

{
=IFERROR(
          INDEX(tblDetails,
                    SMALL(
                              IF(tblDetails[Order ID]=rngOrderId,
                                   ROW(tblDetails[Order ID])-ROW(tblDetails[#Headers])
                              ),
                              ROW(1:1)
                    ),
                    MATCH(C$11, tblDetails[#Headers], 0)
          ),
          "")
}

Let's look at the SMALL function first.  Small works by taking an array (list) of values, and returns the Nth smallest value from that list. 

                    SMALL(
                              IF(tblDetails[Order ID]=rngOrderId,
                                   ROW(tblDetails[Order ID])-ROW(tblDetails[#Headers])
                              ),
                              ROW(1:1)
                    ),

In this case, the array of values is determined by the IF function.  Specifically, IF the Order ID for a row in the Details table equals the Order ID I've selected in the drop down, then add the row number of that row (minus the row number of the heading row, in case the table doesn't start in row 1) to the array of values.  And, once the full Details table is analyzed in this way, return the 1st item in the array - ROW(1:1) returns 1.

Now, you'll see a couple tricks that we've used here:

  1. Instead of ROW(1:1) to return the first item in the array, we could have just used the value 1 (or the value 4 if we wanted the 4th item in the array).  However, by using the ROW function, Excel will adjust the formula for us as we fill it down a range of cells, so the next row will contain ROW(2:2) for example.
  2. We used structured referencing to make the formula more robust.  We could have had cell references for the tblDetails[Order ID] and for the [#Headers], etc. but those would not have adjusted as the source table was modified.  By using structured referencing, we have a pretty solid solution here. (and more readable)

OK, let's move on to the INDEX function next.  INDEX works by taking a 2d array (table) and returns the value that is at the specified row and column position in that array.

          INDEX(tblDetails,
                    SMALL(
                              IF(tblDetails[Order ID]=rngOrderId,
                                   ROW(tblDetails[Order ID])-ROW(tblDetails[#Headers])
                              ),
                              ROW(1:1)
                    ),
                    MATCH(C$11, tblDetails[#Headers], 0)
          ),

In this case, we know that the entire SMALL function returns the ROW(Nth) value in the list of Order Details that match the Order ID that I entered - so in this case the 1st value, which is the row number that we want to retrieve a column out of in the Details table.  The only thing left then is to specify the column number that we want to retrieve, which as we learned earlier is what MATCH is used for - in this case, taking the value in C11 and finding the column in the Details table that has the same name (Product in this case).

Right, so we see a couple more tricks here:

  1. To specify the 2d array (table of data), we are again using a structured reference - tblDetails - as opposed to defining the array using a range reference (A1:G50 or whatever). This gives us robustness.
  2. Ditto with the MATCH function. We could have used the column number explicitly, but by using MATCH, we can much more easily make changes to the source table without having to worry about whether it will break the rest of our model.
  3. Lastly, you'll see that with the MATCH function, we've made it so that C$11 will adjust across, but will not adjust down - that will come in handy as we want to fill this formula to create our details list.

And with that, all we're left with is the IFERROR function and the array function designators. IFERROR is pretty simple - it basically says: if an error occurs while calculating the stuff I contain, replace the error value with an empty string (""). The array function designators (curly braces) are what allow INDEX and SMALL to work over the entire Details table range.

OK, so now that we've discussed the formula in C12, let's quickly finish up the form (and watch the magic of our robust formulas at work). With C12 selected, grab the little fill handle in the bottom right corner of the cell and drag it to H12.

Ooooohhhhhh. Aaaahhhhhh. Notice how that C$11 auto-adjusted to C$12, C$13, etc. as you moved across.

Next, with C12 - H12 selected, grab the fill handle and fill it down 10 rows or so. When you let go, you'll see, as if by magic, that all of the Order Details appear (as appropriate) and now as you change the Order ID at the top of the sheet, the entire Order Details updates.

Figure 8: The completed details table, with the array formula filled across then down in order to retrieve all details.

Pretty cool huh? We had to create a formula that was a bit more complex than just using VLOOKUP, but we also got a Master-Detail view working in a very robust way, using only built-in Excel formulas (no code, etc.).

Of course, I'm sure there are other, equally slick ways of doing this - if you've got one, let's see it!

Categorías: Excel, Inglés

Don't forget about the "A" in "VBA"!

17 Octubre, 2008 - 19:36

Today's author, Stephane Viot, a Program Manager in the Office Global Experience Platform team, reflects on how leveraging application features could help speed up the development of your VBA solutions, while improving their performance.

Often -but hopefully not always- VBA developers forget about the "A" in "VBA". And by that I mean they might have the tendency of trying to do everything in VBA, like they would in VB, forgetting about what the host application has to offer. To illustrate that point -that is the main purpose of this post- let me share with you a practical example. The problem

I often need to extract the list of unique values from a range -usually from a list or a table- and paste it into a mail as a comma-delimited string where all values are sorted in alphabetical order.

For example, given the following list, how would you pull out the list of unique states you are shipping products to, in alphabetical order, using VBA code -so that you can re-use it over and over?

The VBa-only approach

Before actually getting into the details of a better solution, let's see how one could have tackled the problem, using VBA code only. Even if variants exist, the basic steps would be to:

1. Loop through each cell in the current selection and construct an array of unique values.

2. Sort the array.

3. Build the output string using the list separator provided by the user.

4. Copy the output string to the Clipboard.

When working on very large lists, extracting the unique values from a range using the preceding technique might show its limits, and be quite slow. Sorting the values, too.

Now, think how the application could help you do such things. In other words, what steps would you go through if you had to do it "manually"? The manual steps

This is what I would personally do.

  1. Copy the data from the Ship State column (D2:D49; skipping the header cell), and paste it into a new worksheet or workbook:
     
  2. Click the Data tab in the Ribbon, then Remove Duplicates (under the Data Tools group):
     
  3. In the Remove Duplicates dialog, uncheck My data has headers (provided you haven't copied the "Ship State" column header earlier), then click OK.
     
  4. The results of the preceding operation would look like the following, after having sorted the list in alphabetical order using the Sort button:
     
  5. Now that we have the list of unique values sorted in alphabetical order, how would you get the text "CA, CO, FL, ID, IL, NV, NY, OR, TN, UT, WA, WI" from that list? Again, do not think VBA code and how you could concatenate the content of each cell to build the output string. Think formulae instead. Here is one way of doing it:

    B1: =A1
    B2: =B1&", "&A2

    Then copy the formula in B2 down to B12:
     

    So the results of the calculation would be:
     
  6. As you can see, B12 now contains the string we are looking for. All we have to do next is select that cell and copy/paste it into the e-mail message -and get rid of that extraneous workbook or worksheet that is no longer needed.

On very large lists, I can guarantee that you will be amazed to see how fast Excel Remove Duplicates can be. And I would bet, without taking any risks, that it would be much faster than any VBA code anyone could write.

As you have probably guessed it by now, a better solution to the Vba-only approach would be to mimic those manual steps.

A better solution: The CopyUniqueValues Add-In

Rather than providing the full VBA code listing here (available for download at the end of this post), let's instead see how the add-in is organized.

The CopyUniqueValues add-in contains:

  • One worksheet, internally named shtTemp, for extracting the list of unique values;
  • One UserForm, frmCopyUniqueValues, for selecting the delimiter and sorting options for the list;
  • And one module, modCopyUniqueValues, for performing the "dirty work".

Below are the delimiter and sorting options users can select:

Note: The choice of the delimiter would be determined by the intended use. For example, for building a list...

  • ...to be pasted in a Word document, in place of a table -> select "Comma and space";
  • ...of e-mail addresses to send a message to -> select "Semi-colon";
  • ...that can be used as the source of a Data Validation -> select "Carriage return";
  • ...of column headers for a new table you are creating -> select "Tab"
  • Etc.

The GetUniqueStringsList function, in modCopyUniqueValues, is responsible for creating the sorted list of unique values. It relies heavily on Excel features exposed through its Object Model. It performs the following actions:

  1. Copy/paste append the cell values from the source range to column A in shtTemp, in order to build the complete list of values (still containing duplicates);
  2. Remove duplicates from that list by calling the RemoveDuplicates method.

    Note: The RemoveDuplicates method was first introduced in Excel 2007. In order to make the code work in earlier versions of Excel, you would need to use the AdvancedFilter method instead.
  3. Optionally, sort the list of unique values using the Sort method.
  4. Add the formulae discussed earlier for constructing the output string.
  5. Return the output string from the last cell in shtTemp to frmCopyUniqueValues, where it is copied to the Clipboard using the MSForms.DataObject SetText and PutInClipboard methods.
Conclusion

The intent of this article was to remind (as a piece of advice) VBA developers to always check how the host application could help them solve complex operations, before exercising their coding skills.

When writing Excel macros, I am indeed relying on the application even for the simplest tasks (such as the one below), saving me literally lines of VBA code. For example, I would often favor retrieving a SQL statement I want to use for querying a database from a cell rather than construct it programmatically:

  • Cell A1 would contain the SQL statement "template" e.g., "SELECT FirstName, LastName FROM Contacts WHERE EmailAddress='{0}' ORDER BY LastName, FirstName";
  • Cell A2, the value to use for the "{0}" parameter e.g., "johndoe" (supposedly provided by the user);
  • And cell A3 would have the formula "=SUBSTITUTE(A1,"{0}",A2)", returning "SELECT FirstName, LastName FROM Contacts WHERE EmailAddress='johndoe' ORDER BY LastName, FirstName".

That way, I would simply get the value of the SQL statement from cell A3 and be ready with it.

Categorías: Excel, Inglés

Using Office Data Connection files (.odc) and the DataConnections Web Part in SharePoint to Specify External Data Connections in Newly Created Excel Workbooks.

16 Octubre, 2008 - 00:55

Today's author, Christian Stich, a Program Manager on the Excel Services team, discusses how to enable users to quickly and easily create workbooks using external data connections specified in server based data connection files. Overview

Excel and Excel Services support importing external data, which can be specified using connections embedded in the workbook itself or in Office Data Connection (.odc) files. One of the advantages of using .odc files is that any updates to the external data connection properties only have to be done once in the .odc file itself - any workbook using the .odc file for its data connection will use the updated connection properties. Another advantage of .odc files is that the user can open an .odc file directly, which results in Excel creating a new workbook that already contains the external data connection.

Excel Services can only open .odc files that are located in trusted data connection libraries. For additional information please see White paper: Excel Services step-by-step guides and Plan external data connections for Excel Services Creating an .odc file on SharePoint and making it easily accessible

.odc files are stored in trusted data connection libraries. It is desirable to provide users with an easy means of accessing these .odc files. This can be done by adding the data connection library to the navigation links. An alternative approach is the use of SharePoint's "DataConnections" web part.

Let's get started. We first open Excel and create the new data connection using the "Data Connection Wizard." Please make sure to check the "Always attempt to use this file to refresh data" checkbox - this results in an external data connection that is defined in the .odc file. If the checkbox is not checked, then an embedded data connection would be created instead.

The next step is to upload the .odc file to SharePoint. We click on "Finish", followed by a click on "Properties" and then in the "Definition" tab we click on "Export Connection File" in order to publish the .odc file to a trusted data connection library on the SharePoint server. The path to the connection file next to "Connection file" is updated once the connection file has been uploaded to the SharePoint server.

Note: We would require an additional step if the workbook had an existing embedded data connection that we would want to use. In this case we select the data connection, click on "Properties" and then in the "Definition" tab we click on "Export Connection File" in order to publish the .odc file to a trusted data connection library on the SharePoint server. Once the connection file has been uploaded to the SharePoint server the path to the connection file next to "Connection file" is changed from an empty path (since it was an embedded data connection) to a path pointing to the .odc file in the data connection library. Please note that the "Always use connection file" checkbox could not be checked prior to exporting the connection file. We are now able to check the "Always use connection file" checkbox. However, the .odc file on the server does not have that setting enabled yet. Therefore, we need to export the connection file one more time to SharePoint by clicking on "Export Connection File" and then "Save."

The .odc file that specifies the external data connection is now stored in the trusted data connection library on SharePoint.

Next we edit a page on the SharePoint site. We click on "Add a Web Part", then on "Advanced Web Part gallery and options", and then select and add the "DataConnections" web part. We now have a web part on the page that 'shows' the .odc file.

Users viewing the page can simply click on the .odc file in the "DataConnections" web part. This results in the .odc file being opened in the Excel client.

The external data connection in our example points to the Adventure Works Analysis Services Cube example. Excel opens - the workbook already contains a PivotTable using the Adventure Works cube without any further work on our part. We enable external data and click on "Edit Workbook" and then proceed to customize the PivotTable using the PivotTable Field List displayed on the right. Once we have finished editing the workbook we publish it to a trusted file location on SharePoint and, if desired, add an "Excel Web Access" web part on a dashboard page that uses the Excel Workbook that we just uploaded.

Summary

The "DataConnections" web part can include multiple data connections, providing an easy way for users to locate trusted data connections on the SharePoint site and allowing them to rapidly build workbooks using those data connections. At the same time, updates to external data connections, such as migrating a cube or data base from one server to a different server, only require modifying the .odc files in the trusted data connection library. No changes to the individual workbooks are required.

Categorías: Excel, Inglés

What is the fastest way to scan a large range in Excel?

3 Octubre, 2008 - 22:43

Today's author, Dany Hoter, a Product Planner on the Excel team, talks about some performance characteristics he recently discovered using various methods to manipulate large ranges using VBA.

Problem description

You have a large range in Excel with data. Let's say it contains 100,000 rows and 50 columns for each row (Yes you are using Excel 2007 of course). So altogether you have 5,000,000 cells. Columns A to F have some alphanumeric data that you need analyze and based on the combination of values for each row you need to use the numeric values in G to H to do some calculations and store the results in columns I and J. You could place 200,000 formulas in I and J but you see that a spreadsheet with such a volume of formulas gets very sow and consumes huge amounts of memory.

You decide to try and solve it in a piece of VBA code. The question is how to implement such a task in the most efficient way? What are your options

How can you scan a range in Excel, read the values in some cells, and change some others? Use a range object

Let's assume that the range you want to read starts at A1

The code looks something like this:

Dim DataRange as Range ' Could also be Dim DataRange as Object
Dim Irow as Long
Dim MaxRows as Long
Dim Icol as Integer
Dim MaxCols as Long
Dim MyVar as Double
Set DataRange=Range("A1").CurrentRegion
MaxRows= Range("A1").CurrentRegion.Rows.Count
MaxCols= Range("A1").CurrentRegion.Columns.Count
For Irow=1 to MaxRows 
  For icol=1 to MaxCols 
    MyVar=DataRange(Irow,Icol) 
    If MyVar > 0 then 
      MyVar=MyVar*Myvar ' Change the value 
      DataRange(Irow,Icol)=MyVar 
    End If 
  Next Icol
Next Irow

Use the selection and move it using offset

Many VBA developers learned VBA techniques from macro recording.

When using relative reference the generated VBA code creates statements like:

ActiveCell.Offset(0, -1).Range("A1").Select

As a consequence many developers adopt this technique and use the ActiveCell or selection ranges to move from cell to cell in code and read or write the cell values. The code will look like this:

Dim Irow As Long
Dim MaxRows As Long
Dim Icol As Integer
Dim MaxCols As Long
Dim MyVar As Double
Range("A1").Select
MaxRows = Range("A1").CurrentRegion.Rows.Count
MaxCols = Range("A1").CurrentRegion.Columns.Count
For Irow = 1 To MaxRows 
  For Icol = 1 To MaxCols 
    MyVar = ActiveCell.Value 
    If MyVar > 0 Then 
      MyVar=MyVar*Myvar ' Change the value 
      ActiveCell.Value = MyVar 
    End If 
    ActiveCell.Offset(0, 1).Select ' Move one column to the right 
  Next Icol 
  ActiveCell.Offset(1, -MaxCols).Select ' Move one rows down and back to first column
Next Irow

Use a variant type variable

This technique copies the values from all cells in the range into a variable in memory, manipulates the values inside this variable and if needed moves the values back to the range after manipulation.

Here is the code this time:

Dim DataRange As Variant
Dim Irow As Long
Dim MaxRows As Long
Dim Icol As Integer
Dim MaxCols As Long
Dim MyVar As Double
DataRange = Range("A1").CurrentRegion.Value ' Not using set
MaxRows = Range("A1").CurrentRegion.Rows.Count
MaxCols = Range("A1").CurrentRegion.Columns.Count
For Irow = 1 To MaxRows 
  For Icol = 1 To MaxCols 
  MyVar = DataRange(Irow, Icol) 
  If MyVar > 0 Then 
    MyVar=MyVar*Myvar ' Change the value 
    DataRange(Irow, Icol) = MyVar 
  End If
Next Icol
Next Irow
Range("A1").CurrentRegion = DataRange ' writes back the result to the range

Another difference is that this method is blazing fast compared to the two others. Performance Summary

I compared the three methods on relatively large ranges and here are the results:
 

Method

Operation

Cells/Sec

Variant

Read

1,225,490

 

Write

714,286

 

Read/Write

263,158

Range

Read

250,000

 

Write

1818

 

Read/Write

1,852

Offset

Read

206

 

Write

200

 

Read/Write

203

As you can see using a variant variable is much faster especially when changing cells. Even if the calculation can be done with Excel formulas, in some cases this method is the only one acceptable because using a very large number of formulas can become very slow.

Obviously the one method to avoid is moving the ActiveCell using Offset.

Categorías: Excel, Inglés

Building a Basic, Understandable Sudoku Solver Using Excel Iterative Calculation - Part 2/2

2 Octubre, 2008 - 07:25

Today's author, Charlie Ellis, continues discussing the spreadsheet he built to solve Sudoku puzzles.

In my previous post, I walked through a number of formulas for setting up the valid values and solution board. In this post we'll cover using iteration and other formula tricks to help solve the puzzle.

Using valid values to drive solutions

Looking at cells P22:R24 below, we can clearly see that the only solution here is 7:

Let's reflect this in the solution board.

We're going to modify the "else" argument of the IF function here to make it check to see if the val_bigcell_from_sol has only a single answer, and if so, return that. First we need to make val_bigcell_from_sol:

=INDEX(val_board, ROW(Main!A1)*3-2, COLUMN(Main!A1)*3-2):INDEX(val_board, ROW(Main!A1)*3, COLUMN(Main!A1)*3)

Making sure to enter this from the first cell in the solution board (cell D16).  Next we use this to make the else clause:

=IF(in_cell_from_sol,in_cell_from_sol,IF(COUNT(val_bigcell_from_sol)=1, SUM(val_bigcell_from_sol), ""))

Note the trick here to get a value from a 3x3 big cell; we can't just get its value, but since we know there's only once cell with a value, we can just use SUM to get the value of the one cell within the big cell that has a value.

And then we enter it and. whoops, we get a big scary warning:

Click Cancel, and then go to the tools options and make sure that "Enable iterative calculation" is checked:

Now, when you go back to the spreadsheet, it's worked, and we've completely solved this (really easy) Sudoku:

To see what it's doing, you can dial down the maximum iterations to 1 and add a reset to both the solution board and the valid values board. The way I like to do this is with a cell that tracks state, and either has a string that either says "reset" or "solve". Adding this to a cell (I chose D21) and naming it state, you can modify the valid values board formula and the solution board formula to be:

=IF(state="reset",onetonine,IF(sol_cell_from_val<>"",IF(sol_cell_from_val=onetonine, onetonine,""), IF(solution_in_rcb, "",onetonine)))

And

=IF(in_cell_from_sol, in_cell_from_sol, IF(state="Reset", "", IF(COUNT(val_bigcell_from_sol)=1, SUM(val_bigcell_from_sol), "")))

Respectively. Finding the only cell in a row that can take a given value

Let's take a slightly tougher puzzle now, and see how we can extend this very rudimentary solver.

When you try to solve this puzzle with our existing solver, it gets some answers, but then it bogs down. To solve this puzzle, we need to make use of another rule of solving Sudoku puzzles which is that there must be at least one of each number in every row, column and big box. This rule, which is almost the first rule flipped around helps you get to a solution when there is only one possible place in a whole row, column or big box that a particular number can go.

Before we get to how to actually check for this, let's talk about the right place to put this logic. Because it is helping us to find a solution as opposed to eliminate a possibility, it should go in the solution board. And because it covers a case that is the non-reset-state case where a solution can't be found (and the empty string is returned), the right place is where that normally would be. Furthermore, we have to both find a solution and return that solution, so it makes sense to collapse both of those into one name instead of testing for a solution then doing the same work we did to find the solution in order to return it. That means we're going to change our formula to something like:

=IF(in_cell_from_sol, in_cell_from_sol, IF(state="Reset", "",IF(COUNT(val_bigcell_from_sol)=1, SUM(val_bigcell_from_sol), single_sol_in_rcb)))

Where single_sol_in_rcb is either a solution, if one exists, or the empty string.

To implement this, we'll need val_bigcell_from_sol, but also val_bigrow_from_sol, val_bigcol_from_sol, and val_bigbox_from_sol. These are all fairly straightforward extensions on the concepts from sol_row_from_val and val_bigrow_from_sol, so I won't go into detail here, but here's the formula for val_bigbox_from_sol as entered/viewed from cell D16:

=INDEX(val_board, INT((ROW(Main!A1)- 1) / 3) * 9 + 1, INT((COLUMN(Main!A1)- 1 ) / 3)*9+1):INDEX(val_board, INT((ROW(Main!A1)-1) / 3) * 9 + 9, INT((COLUMN(Main!A1)-1) / 3) * 9 + 9 )

For now, we're going to start out just checking the rows for cases where there's only one cell that can contain a given value. Thus we'll make single_sol_in_rcb just be:

=single_sol_in_row

And then eventually make it so that we can detect a single solution in any of rows, columns, or boxes. BTW, it's fine to create names that make use of these as-yet-undefined names, but if you put these into the sheet, Excel annoyingly resizes your columns to accommodate the #NAME errors that inevitably result.

Here's what this looks like on the valid value board:

Here the five in the big cell highlighted in red is the only five in the entire row, so the corresponding solution cell must be five. We can break this down into two pieces:

  1. Checking that a value exists within the valid values board big cell corresponding to the current solution cell
  2. Checking that there's only one of that value in the valid value board big row corresponding to the current solution cell

The most efficient way I've gotten to work, is to do these checks and comparisons by creating an array of the values in the current big cell and multiplying that array by which of those values only occur once in a given row. Here's the formula for this:

=MAX((COUNTIF(val_bigrow_from_sol,array1to9)=1)*(COUNTIF(val_bigcell_from_sol,array1to9)=1)*array1to9)

Where array1to9 is the array {1;2;3;4;5;6;7;8;9}, which is easiest to get using =ROW($A$1:$A$9). The first COUNTIF and equality operator check whether, for each number 1-9, there's only one of the number in the row. In the above example, this becomes {0;1;0;0;1;1;0;0;0} because each of 2, 5, and 6 occur only once in the highlighted row. The second COUNTIF and equality operator check which values are present in the current cell. In the case of the highlighted cell, this is {0;0;0;1;1;0;0;0;0}. Multiplying them together gets the intersection of those two conditions ({0;0;0;0;1;0;0;0;0}), and multiplying this by array1to9 gives you {0;0;0;0;5;0;0;0;0}, which can be reduced to just the desired value by taking the MAX.

Then, in single_sol_in_rcb, in order to not have to call single_sol_in_row more than once to both return an answer and turn the 0 condition into an empty string, you can use the CHOOSE function like so:

=CHOOSE(single_sol_in_row+1, "", 1, 2, 3, 4, 5, 6, 7, 8, 9)

This is enough to solve the second sample puzzle, but extending this to columns and boxes is actually really easy at this point. All you have to do is make a single_sol_in_col and single_sol_in_box, which are identical to single_sol_in_row except for the first named range. Then you have to adjust single_sol_in_rcb to include the solutions for column and box like so:

=CHOOSE(MAX(single_sol_in_row, single_sol_in_col, single_sol_in_box)+1, "", 1, 2, 3, 4, 5, 6, 7, 8, 9)

Well, that's it. The actual workbook itself is attached to my original post. I think it's neat in that there are only really three different formulas on the actual sheet and another 18 in the names. If you think about it as lines of code, there aren't a lot of places outside of obfuscated code contests where you get 21 lines of code to do as much complicated stuff. I think you'll have to be the judge of how clear the formulas are, but by using the commenting on names and further cleaning up naming conventions, I think this could be very understandable. Certainly there's a lot more I could walk through in a future post - primarily how to add in more complicated strategies that I've gotten to work in this framework such as pairs, triples, hidden pairs, hidden triples, and box-and-line, but also debugging strategies, pitfalls for performance or maintainability, etc. - but that's the basic idea. I'd love to hear thoughts on this approach, other approaches, suggestions for new strategies or additional tricks to make existing stuff simpler or faster. Hope you found it interesting.

Categorías: Excel, Inglés

Building a Basic, Understandable Sudoku Solver Using Excel Iterative Calculation - Part 1/2

30 Septiembre, 2008 - 06:56

Today's author, Charlie Ellis, a Program Manager on the Excel team, shares a spreadsheet he built in Excel for solving Sudoku puzzles.  The spreadsheet can be found in the attachments at the bottom of this post.

For those of you who don't already know, Sudoku is a type of logic puzzle (that I was completely addicted to about three years ago) that requires you to place the numbers 1-9 into a grid obeying certain rules (lots more information on Sudoku is available on the web).

A while back, a fellow PM on the Excel team, Dan Cory, wrote a spreadsheet for solving Sudoku puzzles using Excel formulas and made it available on Office Online (here). Dan's spreadsheet was great in that, unlike many of the Sudoku solving spreadsheets out there, it didn't use any VBA or other scripting to do the work of solving the puzzles, and relied instead on the iterative calculation feature of Excel. It's quite cool and has been a popular download, but one thing about the spreadsheet that I wanted to see if I couldn't improve upon was just how complicated it is. In fact, Dan made every single cell its own different formula, and he ended up having to use VBA to create the formulas because maintaining and debugging it without VBA to write all those different formulas in an automated way was impossible.

As soon as I saw Dan's spreadsheet, I wanted to make my own version of a Sudoku solver that not only used only formulas, but also one where the formulas were relatively understandable and there were a small number of distinct formulas. It turned out to not be that tough to build, but I think I learned a fair amount trying different approaches to the problems of making an iterative model like this one perform well and at the same time be reasonably maintainable and understandable. I think it might even have turned up a reasonably useful way at looking at abstraction within formulas given the Excel formula language. I've always wanted to blog about the process of creating this spreadsheet and about how iterative formulas work to show the power of Excel's formula language, because it illustrates the usefulness of circular references and iterative calculation, and because I just think it's an incredible amount of fun so here goes. Lots of people have created more powerful solvers, many as spreadsheets, some using just formulas, but I wanted to try to explain how you can go about creating a solver and hopefully share some formula tricks that people find useful. Pre-reqs

Creating a spreadsheet for solving a Sudoku isn't entry-level spreadsheeting. In addition to being pretty good with formulas, you'll need to understand the concept of iteration. Chris Rae did a great job of explaining the topic in his earlier post on Iteration & Conway's Game of Life, so I'm not going to repeat that, and I'll simply assume you already understand iteration. Second, we're going to make extremely heavy use of named ranges, and for the stuff I'm doing, the new name manager is very helpful (see Formula building improvements Part 4: Defined Names for some information about this) and I'm going to assume working knowledge of it and of named ranges generally (though I'm going to show some tricks which may be new to even experienced formula users). Finally, you'll need to at least be familiar with array notation in Excel. Setting up the boards

For those I haven't lost already, I'm going to start by creating a series of boards very much like the ones that Dan Cory used: one 9x9 board for my input, one 9x9 board for the solution, and a 27x27 board for the possible values in each box. I do this by changing the row height, column width, font, and zoom such that all the cells are small squares and then applying borders and fills to get the following:

The input and solution boards are reasonably straightforward (the input board is the one in the top left where you'll type in a puzzle to be solved, the solution board is where the correct answer hopefully shows up). The board with possible values, which I'll call the valid values board, is a bit trickier. It is 27x27 because each box in the input and solution boards is represented by a 3x3 set of cells in the valid values board. Each of these nine cells represents whether one of the numbers 1-9 is still in the running to be the actual value for the corresponding box of the solution board and the set of possible values for a given cell in the input/solution cell is the set of all the numbers in a single 3x3 "big cell" that are not blank. If it isn't already, the purpose/use of this board should become clear later. For now, let's fill in all the possible values from one to nine in each of these big cells. Filling in the valid value board

We want to do this by creating a single formula that will fill in the various numbers 1-9 based on which row and column the formula sits in, and then we'll later add logic to blank out the numbers that aren't valid. This formula is a little more complicated than the average spreadsheet formula, so I'll first give the whole formula and then break it down. This looks like the following:

= MOD(COLUMN(A1)-1,3) + 1 + MOD(ROW(A1)-1,3)*3

When this is entered into the top-left cell of the valid values board and then filled into the entire valid values board, it gives the following results:

Note that you'll want to do the filling in with either Paste Special | Formulas or CTRL-Enter because otherwise you'll mess up all the pretty formatting.

Breaking this formula down, ROW and COLUMN return (duh) the row or column of the reference passed to them as a number. Passing these functions A1, as in this formula, means they'll give us a number that starts at one and goes up. The first part of the formula uses the modulus function to transform the column numbers given by COLUMN into the numbers 0-2, and then adds one to get 1-3. To this we add a 0, 3, or 6, depending on the row number by using the modulo function on the result of the ROW function.

Next, because that's a bit of a gnarly formula to have sitting around, and we're going to have to use it all over the place, we're going to take this formula and move it out of the cell and into a named range. This allows us to abstract away all of the logic for this formula into a single, understandable name. For lack of a better name, I'm going to call it "onetonine" and it will have the same exact formula we just created. Because the context for the relative references (i.e. what they take as being the current cell) is determined by what cell you're in when you create the named range, it's critical that you start off by selecting cell A1, then create the new named range, so that your formula works everywhere within the sheet.

 

This is also why we allow gutters of three rows and three columns around all the boards.

Now we can take our new name and test it out in the board, like so:

Here CTRL+Enter is by far the easiest way to set the formula for all the cells in the valid values board. First select the whole board, then type in the formula, and instead of pressing Enter, just hit CTRL+Enter to fill the formula you just typed into all the cells (without messing up their formatting). Setting up the solution board

We're going to want to base what valid values are left for a given box on what our current solution looks like (as opposed to the input), but in order to do that, we need something in the solution board. To begin with, at least, the solution will definitely contain all of the numbers in boxes from the input board. Let's start off by doing this in the simplest possible way, while catching the case of blanks. In the solution board, let's make the cells there all simply equal the corresponding cell in the input board using relative references unless the input cell is blank. The absolute easiest way to do this is with the following formula (shown in the form in which it would be entered into cell D16):

=IF(D4,D4,"")

Again, use CTRL+Enter to fill this into the appropriate cells. Now that we have the base thing working, let's make it more re-usable and meaningful by using named ranges.

As we did with the name onetonine, let's abstract the concept of referring to the correct input cell from any cell in the solution board and turn that into a name. We'll need to do something similar for all the boards at some point, so we'll start by making named ranges for each of the boards (I chose in_board, sol_board, and val_board) and then a name to go from the solution board to the input board (in_cell_from_sol) which is simply =Main!D4, then use this to change the formula to be =IF(in_cell_from_sol, in_cell_from_sol, ""). Note that this needs to be input from D16.

OK, so far we just made our formula longer, but trust me, this concept becomes a life saver. Doing the same for valid value cells from solution board cells is only a bit trickier. The name sol_cell_from_val is:

=INDEX(sol_board, INT((ROW(Main!A1)-1)/3)+1, INT((COLUMN(Main!A1)-1)/3)+1)

This must be created from cell P4. This formula uses ROW and COLUMN together with the division operator and INT to convert from the coordinates of the current cell in the 27x27 board to their coordinates in a 9x9 board, then uses INDEX to get the cell out of the sol_board corresponding to those coordinates.

A neat way of testing this formula is to click into the "Refers to" box of the name manager from different cells in the valid values board. Depending on what cell you're in you'll see "dancing ants" (a moving highlight) for a different cell - hopefully the corresponding cell in the solution board.

Now that we have some basics, let's put in an actual puzzle and see about getting the inputs to propagate to the solution board and the valid value board. Here's the puzzle we'll use:

After entering it, the solution board should look like the input board. To make the valid value board work, we use this formula for all valid board cells:

=IF(sol_cell_from_val<>"",IF(sol_cell_from_val=onetonine, onetonine,""), onetonine)

This means that the current cell is blanked out if a value exists in the solution cell and that value isn't the current onetonine value.

This should give you:

Now we're ready to do the stuff that will actually help come up with solutions based on the rules and strategies of Sudoku. Checking for a number in the rows of the solution board

The main rule of Sudoku is that you can't have two of the same number in any row, column, or 3x3 big box. We will start by adding the rule that there can't be more than one of a number in any row and then working in columns and big boxes. For example, in the second big cell in the first row, none of the numbers 4, 2, 7 or 9 are possible as a result of this rule. We can do this by turning blank any cells in the valid values board for which 1) a solution for the box doesn't exist (which is precisely when we get to the final onetonine in the formula for valid value cells) and 2) the row of the solution board contains the number equal to the current value of onetonine. Note that condition #1 is precisely where the last onetonine shows up (i.e. no solution exists for the current big cell), so all we have to do is put the logic for #2 there. This logic can be expressed as:

=IF(COUNTIF(sol_row_from_val, onetonine)>0, "", onetonine)

Where sol_row_from_val is:

=INDEX(sol_board, INT((ROW(Main!A1)-1)/3)+1, 0)

Again, this must be entered from P4.

So combining these we get:

=IF(sol_cell_from_val<>"",IF(sol_cell_from_val=onetonine, onetonine,""), IF(COUNTIF(sol_row_from_val, onetonine)>0, "",onetonine))

Which, while not simple, is at least understandable and gives you a valid values board that looks like this:

Extending to columns and (3x3) big boxes

When we go to add the rules for "no two of the same number in a column" and "no two of the same number in a big box" in this same way, we will run into two problems: 1. you can't create sol_bigbox_from_val directly using INDEX because INDEX only returns a cell, row, or column from a range or the whole range and 2. it will start to get unwieldy to have all three of the COUNTIFs OR'd together at the end of this formula.

To solve the first problem, you can use OFFSET - as you could use OFFSET to create any of the other references here - but because OFFSET is volatile this will lead to performance problems down the road. A better solution is to take the union of two references that you get from INDEX (using the union operator in Excel - the colon) in order to make a 3x3 range. This gives us a sol_bigbox_from_val with the following formula (entered from P4):

=INDEX(sol_board, INT((ROW(Main!A1)-1)/9)*3+1, INT((COLUMN(Main!A1)-1)/9)*3+1):INDEX(sol_board, INT((ROW(Main!A1)-1)/9)*3+3, INT((COLUMN(Main!A1)-1)/9)*3+3)

By now we can pick this formula apart more easily. The INT, ROW, division part says that for every nine rows you move in the valid value board, move down by a block of three rows in the solution board. There's a similar expression for columns that accomplishes much the same thing moving across. The second reference is precisely the first reference, but offset two rows down and two columns across, giving you a 3x3 box.

Now that we have this, we could write one big formula that covers whether the current onetonine value already exists in any of the row, column or big box, but let's use abstraction again here to keep the fundamental formula of the valid value board more simple. Instead of putting it directly into the formula, let's invent a new name called solution_in_rcb for "does there exist a solution cell with my number in any of the row, column, or bigbox?" This name only ever has to return true or false (doing the test part of the condition #2 does above) and despite not being short, is actually really simple to write:

=OR(COUNTIF(sol_row_from_val, onetonine)>0, COUNTIF(sol_col_from_val, onetonine)>0, COUNTIF(sol_bigbox_from_val, onetonine)>0)

Taking advantage of this new name makes our new formula for valid value cells:

=IF(sol_cell_from_val<>"",IF(sol_cell_from_val=onetonine, onetonine,""), IF(solution_in_rcb, "",onetonine))

Which is not only shorter than this formula had been and much more understandable, it also results in some clear places where there's only one possible solution:

So we can eyeball some solutions, but the trick now is to feed those into the solution board. This is where iteration comes in. Next time we'll use iteration and a few more formula tricks to solve some Sudokus.

Edit: Updated the sol_bigbox_from_val formula to reflect what it looks like when entered from the starting cell in P4.  Also clarified in a couple other places that the starting cell should be P4.

Categorías: Excel, Inglés

Chart Advisor

25 Septiembre, 2008 - 19:01

Today's author, Scott Ruble, a Program Manager on the Excel team, is seeking feedback on a new prototype for building charts.

When Office 2007 was released, one of the strong pieces of feedback was Excel needs to do a better job guiding users in the proper selection of charts to effectively communicate their data. Though it wasn't our intent, some of the new formatting options such as glow and legacy 3D charts can be used inappropriately, which obscure the meaning of a chart. Some people felt that these features contributed to creating more "chart junk." In an effort to improve this situation, we have created a prototype called the Chart Advisor. The Chart Advisor intelligently suggests different chart types based on an analysis of your data in Excel 2007. Depending on the feedback we get on Chart Advisor, we may incorporate this as a native Excel feature in a future release. For a quick tour of Chart Advisor, please check out the video or read the steps below and give it a try.

";" mce_src="https://blogs.msdn.com/blogfiles/excel/WindowsLiveWriter/ChartAdvisor_9AED/video48b57581b760.jpg" galleryimg="no">

Video: Chart Advisor Overview

Background

Chart Advisor was developed by Office Labs which is a new organization within Office that focuses on concept validation and incubation. Over the course of several months, we compiled a set of well defined heuristics such as when to use a line chart versus a column chart. These heuristics were incorporated into an advanced rules engine which scans your data and scores all of the relevant chart types. Top scoring charts are presented for previewing, tweaking, and inserting into your Excel worksheet. The rules used for scoring can be complex and we know that some rules are missing or need fine tuning. Given the breadth of possible chart types and data sets this will probably always be the case - even for a fully functional shipping version of Chart Advisor. To address this, the rules were separated from the rules engine into an XML file which can be modified by you to meet your individual business needs. Other notable features of the Chart Advisor are shown below.

  • Automatic pivoting to generate more useful charts
  • Hidden rows and columns are ignored in the spreadsheet
  • Totals for columns or rows are ignored in the charts so as not to skew the data plotted
  • Ability to quickly change the data displayed in the chart through the Modify chart pane
  • Ability to filter data
  • Intelligent mapping of data to the axes
  • Scoring of charts to show the best charts first
  • Chart formatting (legend on/off/location, chart style) is rules-based - e.g., no legend if only one value series

The vision for Chart Advisor was for it to address four different areas of recommendation:

  • Chart type (e.g., column, line, scatter, etc.)
  • Formatting (e.g., fill, color, effects, etc.)
  • Chart element (e.g., legend, trendline, error bars, etc.)
  • Aggregations and pivots

The Chart Advisor is a proof of concept. As such, it doesn't address all aspects of our vision just yet. Currently it focuses on providing chart type suggestions and to a more limited extent formatting, chart element, aggregation, and pivot suggestions. However, I'd love to get your feedback on other ways you think the Chart Advisor could add value. Steps to Use

Using the Chart Advisor is pretty straight forward. As noted previously, please check out the video for a quick tour or use the following steps.

Sample Table of Data

Region

Year

Sales

Market Share

CA

1980

$5,855

11%

OR

1980

$10,839

21%

WA

1980

$15,302

29%

ID

1980

$20,231

39%

CA

1985

$5,175

10%

OR

1985

$9,905

20%

WA

1985

$14,248

28%

ID

1985

$21,040

42%

CA

1990

$10,151

14%

OR

1990

$15,140

21%

WA

1990

$20,436

28%

ID

1990

$26,886

37%

 

 

 

 

 

  1. Install Chart Advisor.
  2. Copy the data from the sample table above into an Excel 2007 workbook.
  3. Select the data that you would like to chart. If you select all of the data in the sample table, Chart Advisor will analyze all of the rows and columns and provide suggestions based on all of data and subsets of the data. This is useful when you don't have any idea how to visualize the data or want to explore a variety of possibilities. If you would like a more narrow view of the data, select just those columns. For example, select the columns for Region and Sales.
  4. From the "Insert" tab in Excel 2007, click on the "Chart Advisor" button in the ribbon. This will launch the Chart Advisor and start the data analysis process. Note that this analysis may take a long time for large data sets. As noted previously, Chart Advisor is a prototype and does not have the performance of a truly native feature in Excel. The performance would significantly be improved if Chart Advisor shipped with a future version of Excel.
  5. After the Chart Advisor dialog comes up, hover over the recommendations at the top of the dialog. This will show a live preview of the suggested chart in the main window. To better understand the scoring, hover over the percentage in the lower right corner of the chart thumbnail. For the sample data, notice that the rules engine detected repeating rows for the Region column and suggests several aggregated views of the data.
  6. Modify the chart data as you see fit by turning on or off fields in the "Modify Chart" panel on the left. You can also filter the data by using the "Filter Data" panel.
  7. Once you are done modifying the chart, click the "Insert Chart" button in the lower right corner of the dialog. Note that you won't be able to re-enter this dialog to make changes after clicking the insert button. Again, this is a prototype with some limitations.
Caveats

Our intent was to make the Chart Advisor functional enough for experimentation and validating the concept. As such, it does have several limitations which I've mentioned previously but I'll summarize here.

  • Not all aspects of the intended vision have been addressed. The focus for this first prototype was primarily on chart type recommendations.
  • The rules engine needs fine tuning for some data sets. As such, it may suggest inappropriate chart types periodically.
  • For large data sets, the performance may be slow.
  • There may be bugs in the prototype that limit its use. The prototype has been tested but not to the degree that a typical shipping product would receive.
Feedback

As mentioned previously, we would love to get your feedback on the Chart Advisor. Though it isn't fully functionally just yet, conceptually speaking would it meet your needs, what changes would you like to see, and what other ways could it add value to you or your company? Please give it a try and tell a friend.

Categorías: Excel, Inglés


necesitomas.com en tu página principal de Google